This weekend in Generative Media
Apple plans AI deals with publishers; AI ethics falls by the wayside; Midjourney V6
Apple Explores A.I. Deals With News Publishers (New York Times)
This week in AI: AI ethics keeps falling by the wayside (TechCrunch)
Midjourney V6 is here with in-image text and completely overhauled prompting (VentureBeat)
Midjourney will "find you and collect that money" if you infringe any IP with v6 (Decoder)
An artist fights back, and Midjourney has embarrassed themselves (Gary Marcus on Substack)
Creators, porn stars turn to AI doppelgangers to keep fans entertained (Washington Post)
Apple wants AI to run directly on its hardware instead of in the cloud (Ars Technica)
The Simulation by Fable open sources AI tool to power Westworlds of the future (VentureBeat)
Journalists Had 'No Idea' About OpenAI's Deal to Use Their Stories (Wired)
A ‘thirsty’ generative AI boom poses a growing problem for Big Tech (CNBC)
Wizards of the Coast doubles down on generative AI stance, says artists are ‘what makes D&D great’ (GeekWire)
Longtime gaming leader Jordan Weisman on his new AI-driven game development platform (GeekWire)
An AI Haunted World (Ethan Mollick on Substack)
AI-Generated Sci-Fi Novel Secures Prestigious Literary Prize in China (Medium)
The Move One iOS App makes it easy to capture and create 3D animations with just a phone (move.ai)
After working on my AI assisted free 2.5D point and click adventure game for a year, I am releasing a demo for PC and OS X! This demo shows how AI tools help a solo dev make a game and what it could be. I have the whole process in my blog (X)
An experimental freeware 2.5D adventure game featuring AI-assisted graphics (Echoes of Somewhere)
Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models (project page)
DreamTuner: Single Image is Enough for Subject Driven Generation (project page)
Splatter Image: Ultra-Fast Single-View 3D Reconstruction (project page)
pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction (project page)
Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting (project page)
Paint3D: Paint Anything 3D with Lighting-less Texture Diffusion Models (project page)
Intrinsic Image Diffusion for Single-view Material Estimation (project page)
Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model (project page)
Global Latent Neural Rendering (project page)
Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis (project page)
VidToMe: Video Token Merging (project page)
Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models (arXiv)
Analyzing and Improving the Training Dynamics of Diffusion Models (arXiv)
All-In-One Music Structure Analyzer (GitHub). Check out this excellent demo!
OpenLRM is an open-source implementation of Large Reconstruction Models.
Image-to-3D in 10 seconds! (HuggingFace)
I made a script that watches for new screenshots and renames them using Llava 13B (X)