Sıla Kara, Armin Gulbert

Altered Present Metronom

Exploring digital presence as a collective condition, using AI to reconfigure how bodies relate to space and to each other.


Things used in this project

Hardware components

Apple iPhone
Captures real-time video input via the built-in camera.
×1
Laptop
Runs the software listed below.
×1

Software apps and online services

TouchDesigner
Used to:
- translate video input: https://derivative.ca/UserGuide/Video_Device_Out_TOP
- generate 3D objects: https://derivative.ca/UserGuide/Sphere_SOP + https://derivative.ca/UserGuide/Transform_SOP
- transform and modify video: https://derivative.ca/UserGuide/Edge_TOP + https://derivative.ca/UserGuide/Level_TOP
- send video input to DayDream for real-time AI generation
- generate the metronome sound: https://derivative.ca/UserGuide/Beat_CHOP + https://derivative.ca/UserGuide/Audio_Oscillator_CHOP
- merge outgoing video outputs: https://derivative.ca/UserGuide/Video_Device_Out_TOP
Camo
Desktop client and mobile app that create a wireless connection for sending real-time camera footage: https://camo.com/camera
DayDream
A platform with an API that uses generative AI to produce audiovisual content: https://docs.daydream.live/introduction
StreamDiffusion
A pre-built integration between DayDream and TouchDesigner via custom API https://docs.daydream.live/scope/reference/pipelines/streamdiffusion-v2#streamdiffusion-v2
Bambu Studio

Hand tools and fabrication machines

Bambu Lab A1

Story


Schematics

High-level architecture

An architectural overview of the solution covering the hardware and software components and the data flows between them.

High-level concept

High-level code overview (TouchDesigner)

Code

Metronome code

Python
import math

v = op('trig1')['sine']
shaped = math.copysign(abs(v) ** 1.6, v)  # sharpen the sine into a swing curve


LFO (ramp 0–1 per beat)
→ Math CHOP (×360)
→ Trig CHOP (use sine output)
→ Math CHOP (× swingAmount)
→ Null CHOP (export this)
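For reference, the CHOP chain above can be mirrored in plain Python: a 0–1 ramp per beat is scaled to degrees, run through sine, shaped with the `copysign` expression from the snippet above, and scaled by `swingAmount`. The function name `swing_offset` and its defaults are illustrative, not part of the patch.

```python
import math

def swing_offset(ramp, swing_amount=1.0, shape=1.6):
    """Plain-Python equivalent of the CHOP chain:
    ramp (0..1 per beat) -> degrees (x360) -> sine
    -> copysign(|v|**shape, v) -> x swing_amount."""
    v = math.sin(math.radians(ramp * 360.0))
    return math.copysign(abs(v) ** shape, v) * swing_amount

# One beat sampled at quarter points: rest, full swing right, rest, full swing left.
[round(swing_offset(t), 3) for t in (0.0, 0.25, 0.5, 0.75)]  # → [0.0, 1.0, 0.0, -1.0]
```

The `shape` exponent keeps the endpoints of the sine intact while steepening the motion through the center, which reads as a more mechanical pendulum swing.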

StreamDiffusion code sample

Python
import numpy as np
import torch

# You may need to adapt imports depending on the StreamDiffusion version you installed.
from diffusers import StableDiffusionImg2ImgPipeline
from streamdiffusion import StreamDiffusion

class StreamDiffusionEngine:
    def __init__(self):
        self.device = "cuda" if torch.cuda.is_available() else "cpu"
        self.dtype  = torch.float16 if self.device == "cuda" else torch.float32

        self.pipe = None
        self.stream = None
        self.ready = False

        # defaults
        self.prompt = "a surreal metronome pendulum, cinematic lighting"
        self.negative_prompt = "lowres, blurry, bad anatomy"
        self.cfg = 1.2
        self.steps = 4
        self.seed = 1234

    def setup(self, model_id="runwayml/stable-diffusion-v1-5"):
        # Load img2img pipeline
        self.pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
            model_id,
            torch_dtype=self.dtype,
            safety_checker=None,
            requires_safety_checker=False,
        ).to(self.device)

        # Build StreamDiffusion wrapper
        self.stream = StreamDiffusion(
            pipe=self.pipe,
            t_index_list=list(range(self.steps)),  # simple default; tune for quality/speed
            torch_dtype=self.dtype,
        )

        # Prepare once (important for speed)
        g = torch.Generator(device=self.device).manual_seed(self.seed)
        self.stream.prepare(
            prompt=self.prompt,
            negative_prompt=self.negative_prompt,
            num_inference_steps=self.steps,
            guidance_scale=self.cfg,
            generator=g,
        )

        self.ready = True

    @torch.inference_mode()
    def process_rgba(self, rgba_uint8: np.ndarray):
        """
        rgba_uint8: HxWx4 uint8 (TouchDesigner-style)
        returns:    HxWx4 uint8
        """
        if not self.ready:
            return rgba_uint8

        # Convert to torch image (HWC uint8 -> float tensor)
        img = torch.from_numpy(rgba_uint8[..., :3]).to(self.device)  # RGB
        img = img.permute(2, 0, 1).unsqueeze(0).to(dtype=self.dtype) / 255.0  # 1x3xHxW

        out = self.stream(img)  # StreamDiffusion forward (returns torch tensor image)

        # Convert back to uint8 RGBA
        out = out.squeeze(0).permute(1, 2, 0).clamp(0, 1)  # HxWx3
        out = (out * 255.0).to(torch.uint8).detach().cpu().numpy()

        rgba = np.concatenate([out, 255 * np.ones((*out.shape[:2], 1), dtype=np.uint8)], axis=2)
        return rgba

ENGINE = StreamDiffusionEngine()

Credits

Sıla Kara
Armin Gulbert
