Sıla Kara, Armin Gulbert

Altered Present Metronom

Exploring digital presence as a collective condition, using AI to reconfigure how bodies relate to space and to each other.


Things used in this project

Hardware components

Apple iPhone
Captures real-time video input via the built-in camera.
×1
Laptop
Runs the software listed below.
×1

Software apps and online services

TouchDesigner
Used to:
- translate video input: https://derivative.ca/UserGuide/Video_Device_Out_TOP
- generate 3D objects: https://derivative.ca/UserGuide/Sphere_SOP + https://derivative.ca/UserGuide/Transform_SOP
- transform and modify video: https://derivative.ca/UserGuide/Edge_TOP + https://derivative.ca/UserGuide/Level_TOP
- send video input to DayDream for real-time AI generation
- generate the metronome sound: https://derivative.ca/UserGuide/Beat_CHOP + https://derivative.ca/UserGuide/Audio_Oscillator_CHOP
- merge outgoing video outputs: https://derivative.ca/UserGuide/Video_Device_Out_TOP
Camo
Desktop client and mobile app that create a wireless connection for sending real-time camera footage: https://camo.com/camera
DayDream
A platform with an API that uses generative AI to produce audiovisual content: https://docs.daydream.live/introduction
StreamDiffusion
A pre-built integration between DayDream and TouchDesigner via custom API https://docs.daydream.live/scope/reference/pipelines/streamdiffusion-v2#streamdiffusion-v2
Bambu Studio

Hand tools and fabrication machines

Bambu Lab A1

Story


Schematics

High-level architecture

An architectural overview of the solution covering the hardware and software components and the data flows between them.

High-level concept

High-level code overview (TouchDesigner)

Code

Metronome code

Python
import math

v = op('trig1')['sine']
shaped = math.copysign(abs(v) ** 1.6, v)  # sharpen the sine into a swing curve


LFO (ramp 0–1 per beat)
→ Math CHOP (×360)
→ Trig CHOP (use sine output)
→ Math CHOP (× swingAmount)
→ Null CHOP (export this)
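For reference, the CHOP chain above can be mirrored in plain Python: a 0–1 ramp per beat is scaled to degrees, run through sine, shaped with the `copysign` expression from the snippet above, and scaled by `swingAmount`. The function name `swing_offset` and its defaults are illustrative, not part of the patch.

```python
import math

def swing_offset(ramp, swing_amount=1.0, shape=1.6):
    """Plain-Python equivalent of the CHOP chain:
    ramp (0..1 per beat) -> degrees (x360) -> sine
    -> copysign(|v|**shape, v) -> x swing_amount."""
    v = math.sin(math.radians(ramp * 360.0))
    return math.copysign(abs(v) ** shape, v) * swing_amount

# One beat sampled at quarter points: rest, full swing right, rest, full swing left.
[round(swing_offset(t), 3) for t in (0.0, 0.25, 0.5, 0.75)]  # → [0.0, 1.0, 0.0, -1.0]
```

The `shape` exponent keeps the endpoints of the sine intact while steepening the motion through the center, which reads as a more mechanical pendulum swing.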

StreamDiffusion code sample

Python
import numpy as np
import torch

# You may need to adapt imports depending on the StreamDiffusion version you installed.
from diffusers import StableDiffusionImg2ImgPipeline
from streamdiffusion import StreamDiffusion

class StreamDiffusionEngine:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.dtype  = torch.float16 if self.device == "cuda" else torch.float32

        self.pipe = None
        self.stream = None
        self.ready = False

        # defaults
        self.prompt = "a surreal metronome pendulum, cinematic lighting"
        self.negative_prompt = "lowres, blurry, bad anatomy"
        self.cfg = 1.2
        self.steps = 4
        self.seed = 1234

    def setup(self, model_id="runwayml/stable-diffusion-v1-5"):
        # Load img2img pipeline
        self.pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
            model_id,
            torch_dtype=self.dtype,
            safety_checker=None,
            requires_safety_checker=False,
        ).to(self.device)

        # Build StreamDiffusion wrapper
        self.stream = StreamDiffusion(
            pipe=self.pipe,
            t_index_list=list(range(self.steps)),  # simple default; tune for quality/speed
            torch_dtype=self.dtype,
        )

        # Prepare once (important for speed)
        g = torch.Generator(device=self.device).manual_seed(self.seed)
        self.stream.prepare(
            prompt=self.prompt,
            negative_prompt=self.negative_prompt,
            num_inference_steps=self.steps,
            guidance_scale=self.cfg,
            generator=g,
        )

        self.ready = True

    @torch.inference_mode()
    def process_rgba(self, rgba_uint8: np.ndarray):
        """
        rgba_uint8: HxWx4 uint8 (TouchDesigner-style)
        returns:    HxWx4 uint8
        """
        if not self.ready:
            return rgba_uint8

        # Convert to torch image (HWC uint8 -> float tensor)
        img = torch.from_numpy(rgba_uint8[..., :3]).to(self.device)  # RGB
        img = img.permute(2, 0, 1).unsqueeze(0).to(dtype=self.dtype) / 255.0  # 1x3xHxW

        out = self.stream(img)  # StreamDiffusion forward (returns torch tensor image)

        # Convert back to uint8 RGBA
        out = out.squeeze(0).permute(1, 2, 0).clamp(0, 1)  # HxWx3
        out = (out * 255.0).to(torch.uint8).detach().cpu().numpy()

        rgba = np.concatenate([out, 255 * np.ones((*out.shape[:2], 1), dtype=np.uint8)], axis=2)
        return rgba

ENGINE = StreamDiffusionEngine()

Credits

Sıla Kara
Armin Gulbert
