diff --git a/talkinghead/README.md b/talkinghead/README.md
index e1c93a7..691940b 100644
--- a/talkinghead/README.md
+++ b/talkinghead/README.md
@@ -224,9 +224,10 @@ The following postprocessing filters are available. Options for each filter are
 
 - `analog_lowres`: Simulates a low-resolution analog video signal by blurring the image.
 - `analog_badhsync`: Simulates bad horizontal synchronization (hsync) of an analog video signal, causing a wavy effect that causes the outline of the character to ripple.
+- `analog_distort`: Simulates a rippling, runaway hsync near the top or bottom edge of an image. This can happen with some equipment if the video cable is too long.
 - `analog_vhsglitches`: Simulates a damaged 1980s VHS tape. In each 25 FPS frame, causes random lines to glitch with VHS noise.
 - `analog_vhstracking`: Simulates a 1980s VHS tape with bad tracking. The image floats up and down, and a band of VHS noise appears at the bottom.
-- `shift_distort`: Simulates a glitchy digital video transport as sometimes depicted in sci-fi, with random blocks of lines shifted horizontally.
+- `shift_distort`: A glitchy digital video transport as sometimes depicted in sci-fi, with random blocks of lines suddenly and temporarily shifted horizontally.
 
 **Display**:
 
diff --git a/talkinghead/TODO.md b/talkinghead/TODO.md
index 5891a31..c19a791 100644
--- a/talkinghead/TODO.md
+++ b/talkinghead/TODO.md
@@ -63,20 +63,13 @@ Not scheduled for now.
   - The effect on speed will be small; the compute-heaviest part is the inference of the THA3 deep-learning model.
 - Add more postprocessing filters.
   Possible ideas, no guarantee I'll ever get around to them:
   - Pixelize, posterize (8-bit look)
-  - Analog video glitches
-    - Partition image into bands, move some left/right temporarily (for a few frames now that we can do that)
-    - Another effect of bad VHS hsync: dynamic "bending" effect near top edge:
-      - Distortion by horizontal movement
-      - Topmost row of pixels moves the most, then a smoothly decaying offset profile as a function of height (decaying to zero at maybe 20% of image height, measured from the top)
-      - The maximum offset flutters dynamically in a semi-regular, semi-unpredictable manner (use a superposition of three sine waves at different frequencies, as functions of time)
   - Digital data connection glitches
     - Apply to random rectangles; may need to persist for a few frames to animate and/or make them more noticeable
-    - May need to protect important regions like the character's head (approximately, from the template); we're after "Hollywood glitchy", not actually glitchy
     - Types:
       - Constant-color rectangle
       - Missing data (zero out the alpha?)
       - Blur (leads to replacing by average color, with controllable sigma)
-      - Zigzag deformation
+      - Zigzag deformation (perhaps not needed now that we have `shift_distort`, which is similar, but with a rectangular shape, and applied to full lines of video)
 - Investigate if some particular emotions could use a small random per-frame oscillation applied to "iris_small", for that anime "intense emotion" effect (since THA3 doesn't have a morph specifically for the specular reflections in the eyes).
diff --git a/talkinghead/tha3/app/postprocessor.py b/talkinghead/tha3/app/postprocessor.py
index 6e08164..c835aa2 100644
--- a/talkinghead/tha3/app/postprocessor.py
+++ b/talkinghead/tha3/app/postprocessor.py
@@ -495,6 +495,70 @@ class Postprocessor:
         warped = warped.squeeze(0)  # [1, c, h, w] -> [c, h, w]
         image[:, :, :] = warped
 
+    def analog_distort(self, image: torch.tensor, *,
+                       speed: float = 8.0,
+                       strength: float = 0.1,
+                       ripple_amplitude: float = 0.05,
+                       ripple_density1: float = 4.0,
+                       ripple_density2: Optional[float] = 13.0,
+                       ripple_density3: Optional[float] = 27.0,
+                       edge: str = "top") -> None:
+        """[dynamic] Analog video signal distorted by a runaway hsync near the top or bottom edge.
+
+        A bad video cable connection can do this, e.g. when connecting a game console to a display
+        with an analog YPbPr component cable 10m in length. In reality, when I ran into this phenomenon,
+        the distortion only occurred for near-white images, but as glitch art, it looks better if it's
+        always applied at full strength.
+
+        `speed`: At speed 1.0, a full cycle of the rippling effect completes every `image_height` frames.
+                 So effectively the cycle position updates by `speed * (1 / image_height)` at each frame.
+        `strength`: Base strength for maximum distortion at the edge of the image.
+                    In units where the height and width of the image are both 2.0.
+        `ripple_amplitude`: Variation on top of `strength`.
+        `ripple_density1`: Like `density` in `analog_badhsync`, but in time. How many cycles the first
+                           component wave completes per one cycle of the ripple effect.
+        `ripple_density2`: Like `ripple_density1`, but for the second component wave.
+                           Set to `None` or to 0.0 to disable the second component wave.
+        `ripple_density3`: Like `ripple_density1`, but for the third component wave.
+                           Set to `None` or to 0.0 to disable the third component wave.
+        `edge`: one of "top", "bottom". Near which edge of the image to apply the maximal distortion.
+                The distortion then decays to zero, with a quadratic profile, in 1/8 of the image height.
+
+        Note that "frame" here refers to the normalized frame number, at a reference of 25 FPS.
+        """
+        c, h, w = image.shape
+
+        # Animation
+        # FPS correction happens automatically, because `frame_no` is normalized to CALIBRATION_FPS.
+        cycle_pos = (self.frame_no / h) * speed
+        cycle_pos = cycle_pos - float(int(cycle_pos))  # fractional part
+        cycle_pos *= 2.0  # full cycle = 2 units
+
+        # Deformation
+        # The spatial distort profile is a quadratic curve [0, 1], for 1/8 of the image height.
+        meshy = self._meshy
+        if edge == "top":
+            spatial_distort_profile = (torch.clamp(meshy + 0.75, max=0.0) * 4.0)**2  # distort near y = -1
+        else:  # edge == "bottom":
+            spatial_distort_profile = (torch.clamp(meshy - 0.75, min=0.0) * 4.0)**2  # distort near y = +1
+        ripple_amplitude = ripple_amplitude
+        ripple = math.sin(ripple_density1 * cycle_pos * math.pi)
+        if ripple_density2:
+            ripple += math.sin(ripple_density2 * cycle_pos * math.pi)
+        if ripple_density3:
+            ripple += math.sin(ripple_density3 * cycle_pos * math.pi)
+        instantaneous_strength = (1.0 - ripple_amplitude) * strength + ripple_amplitude * ripple
+        # The minus sign: read coordinates toward the left -> shift the image toward the right.
+        meshx = self._meshx - instantaneous_strength * spatial_distort_profile
+
+        # Then just the usual incantation for applying a geometric distortion in Torch:
+        grid = torch.stack((meshx, meshy), 2)
+        grid = grid.unsqueeze(0)  # batch of one
+        image_batch = image.unsqueeze(0)  # batch of one -> [1, c, h, w]
+        warped = torch.nn.functional.grid_sample(image_batch, grid, mode="bilinear", padding_mode="border", align_corners=False)
+        warped = warped.squeeze(0)  # [1, c, h, w] -> [c, h, w]
+        image[:, :, :] = warped
+
     def _vhs_noise(self, image: torch.tensor, *, height: int) -> torch.tensor:
         """Generate a horizontal band of noise that looks as if it came from a blank VHS tape.
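
To make the arithmetic in `analog_distort` easier to check by hand, here is a minimal scalar sketch of the same formulas using only the standard library. It is not part of the patch: the helper names (`cycle_position`, `instantaneous_strength`, `spatial_profile`, `row_offset`) are hypothetical, `frame_no` is assumed to be a non-negative normalized frame number, and `y` is assumed to be a normalized vertical coordinate in [-1, 1] with -1 at the top, as the comments in the diff suggest.

```python
import math
from typing import Iterable, Optional


def cycle_position(frame_no: float, image_height: int, speed: float = 8.0) -> float:
    """Cycle position in [0, 2), following the `cycle_pos` lines in the diff."""
    pos = (frame_no / image_height) * speed
    pos -= float(int(pos))  # fractional part (assumes frame_no >= 0)
    return 2.0 * pos  # full cycle = 2 units


def instantaneous_strength(cycle_pos: float,
                           strength: float = 0.1,
                           ripple_amplitude: float = 0.05,
                           densities: Iterable[Optional[float]] = (4.0, 13.0, 27.0)) -> float:
    """Base strength plus a superposition of up to three sine waves."""
    # A density of None or 0.0 disables that component wave.
    ripple = sum(math.sin(d * cycle_pos * math.pi) for d in densities if d)
    return (1.0 - ripple_amplitude) * strength + ripple_amplitude * ripple


def spatial_profile(y: float, edge: str = "top") -> float:
    """Quadratic falloff: 1 at the chosen edge, 0 at 1/8 of the height away from it."""
    if edge == "top":
        return (min(y + 0.75, 0.0) * 4.0) ** 2  # nonzero only for y < -0.75
    return (max(y - 0.75, 0.0) * 4.0) ** 2  # nonzero only for y > +0.75


def row_offset(frame_no: float, y: float, image_height: int, edge: str = "top") -> float:
    """Horizontal offset subtracted from the x sampling coordinate, in units where the width is 2.0."""
    pos = cycle_position(frame_no, image_height)
    return instantaneous_strength(pos) * spatial_profile(y, edge)
```

For example, `row_offset(0, -1.0, 512)` evaluates the offset of the topmost row at frame 0, where all sine terms vanish and only the base strength term remains; any row with `y >= -0.75` gets an offset of exactly zero for `edge="top"`.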