Generative AI

introduction

text2image generation

How does it work?

What do you need to know to get started?

Why doesn't the AI understand me?

frameworks

example

An AI robot teaching a classroom full of adults, drawn by Dali

example

An AI robot teaching a classroom full of adults, painted by Pieter Brueghel the Elder

How

does AI work?

it's complicated to explain, but easy to use

Compare with MP3 compression

now for AI

encoding step

latent space vector

Latent Space?

let's take a latent space walk
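
A latent space walk can be pictured as sliding step by step between two latent vectors and decoding an image at each stop. The sketch below shows only the interpolation part, in plain Python. The 4-number vectors and the names `lerp` and `latent_walk` are made up for illustration; real models use much larger vectors and often interpolate spherically rather than linearly.

```python
def lerp(a, b, t):
    """Linear interpolation between vectors a and b, with t in [0, 1]."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]


def latent_walk(start, end, steps):
    """Return `steps` vectors evenly spaced from start to end (inclusive)."""
    if steps < 2:
        raise ValueError("a walk needs at least 2 steps")
    return [lerp(start, end, i / (steps - 1)) for i in range(steps)]


# Hypothetical 4-dimensional latent vectors (real ones are far larger).
latent_a = [0.0, 1.0, -1.0, 2.0]
latent_b = [1.0, 0.0, 1.0, 0.0]

walk = latent_walk(latent_a, latent_b, steps=5)
for vector in walk:
    print([round(v, 2) for v in vector])
```

Each printed vector would be handed to the decoder to produce one frame of the walk.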

What is latent space?

training on data


src link

train your own latent space

Stable diffusion example

  • lots of data, e.g. the LAION-5B dataset
    • 5.85 billion image-text pairs.
    • a hard drive of 240TB
  • 32 x 8 x A100 GPUs
  • cost: approx. $600,000
  • carbon cost: 11,250 kg CO2
  • or roughly 2.5 cars each driving 15,000 km in a year
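
The figures above can be sanity-checked with a little arithmetic. The sketch below uses only the numbers from the list. The variable names are illustrative, decimal units (1 TB = 10^12 bytes) are an assumption, and "32 x 8" is read as 32 machines with 8 GPUs each.

```python
# Figures taken from the list above.
image_text_pairs = 5.85e9      # LAION-5B image-text pairs
dataset_bytes = 240e12         # 240 TB, assuming decimal terabytes
gpus = 32 * 8                  # assumed: 32 machines x 8 A100 GPUs each
cost_usd = 600_000             # approximate training cost
co2_kg = 11_250                # carbon cost of training
cars = 2.5                     # equivalent number of cars
km_per_car_per_year = 15_000

kb_per_pair = dataset_bytes / image_text_pairs / 1_000
cost_per_gpu = cost_usd / gpus
co2_per_km = co2_kg / (cars * km_per_car_per_year)

print(f"GPUs in total:        {gpus}")
print(f"Storage per pair:     {kb_per_pair:.0f} kB")
print(f"Cost per GPU:         ${cost_per_gpu:,.0f}")
print(f"Implied car emission: {co2_per_km * 1000:.0f} g CO2 per km")
```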

dataset explorer

talking about data

an AI model represents its training data

training data...

  • represents digital society
  • represents many artists
  • cleaned for a broad audience (NSFW, violence, ...)
  • ≠ (version of) model, ≠ dataset

let's encode some text too

latent space vector

features visualisation

Remember this one?

right, a decoder is missing!

latent diffusion model

type prompt

get image

What

do you need to know to get started?

Prompt Engineering

A prompt consists of:

  • 1. A (main) topic
  • 2. an environment
  • 3. details
  • 4. atmosphere and context of the scene
  • 5. style (artist, medium)
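
The five parts above can be treated as slots that are filled in and joined into one prompt. This is a minimal sketch, not a required format: `build_prompt` is a hypothetical helper, and the "details" and "atmosphere" values are invented for the example.

```python
def build_prompt(topic, environment="", details="", atmosphere="", style=""):
    """Join the five prompt parts in order, skipping any that are empty."""
    parts = [topic, environment, details, atmosphere, style]
    return ", ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    topic="an AI robot teaching",
    environment="a classroom full of adults",
    details="chalkboard covered in diagrams",      # invented example detail
    atmosphere="warm afternoon light",             # invented example atmosphere
    style="painted by Pieter Brueghel the Elder",
)
print(prompt)
```

Leaving a slot empty simply drops it, so `build_prompt("cyberpunk forest", style="by Salvador Dali")` gives a short topic-plus-style prompt.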

positive prompt

prompt: cyberpunk forest by Salvador Dali

img credit Stable Diffusion 2.0

negative prompt

prompt: cyberpunk forest by Salvador Dali

negative prompt: trees, green

img credit Stable Diffusion 2.0
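
One common way Stable Diffusion tools apply a negative prompt is through classifier-free guidance. At each denoising step the model predicts noise once for the prompt and once for the negative prompt, then pushes the result away from the negative prediction. The sketch below shows only that blending step, on made-up 3-number vectors. `guide` and `guidance_scale` are illustrative names, and a real pipeline works on full latent tensors.

```python
def guide(noise_positive, noise_negative, guidance_scale=7.5):
    """Start at the negative prediction and move toward (and past) the positive one."""
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(noise_positive, noise_negative)
    ]


# Toy noise predictions; in a real model these come from the denoising network.
noise_for_prompt = [0.2, -0.1, 0.4]     # stands in for "cyberpunk forest by Salvador Dali"
noise_for_negative = [0.1, 0.3, 0.4]    # stands in for "trees, green"

guided = guide(noise_for_prompt, noise_for_negative, guidance_scale=2.0)
print([round(v, 2) for v in guided])
```

Where the two predictions agree, nothing changes. Where they differ, a larger `guidance_scale` moves the result further from what the negative prompt describes.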

Help composing a prompt?

phraser.tech/builder

The seed

The problem modern computers have with randomness is that it doesn’t make mathematical sense.

The seed

A number to generate a pseudo-random noise image
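
The idea can be tried out with Python's standard `random` module: the same seed always produces the same "noise image". This is a simplified sketch. `noise_image` is a hypothetical helper, the grid is tiny, and it uses uniform values where diffusion models typically start from Gaussian noise.

```python
import random


def noise_image(seed, width=4, height=4):
    """Return a height x width grid of pseudo-random values in [0, 1)."""
    rng = random.Random(seed)  # the same seed always gives the same sequence
    return [[rng.random() for _ in range(width)] for _ in range(height)]


first = noise_image(seed=42)
again = noise_image(seed=42)
other = noise_image(seed=43)

print("seed 42 twice gives identical noise:", first == again)
print("seed 42 and seed 43 give identical noise:", first == other)
```

This is why keeping the seed fixed while changing one word of the prompt makes the resulting images comparable: they all start from the same noise.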

The seed

Base prompt
Smiling
Angry
Excited

WHY

the AI doesn't understand me?

some tips

strange trees and weird heads

Compare

redditpost

look for trouble

Learn

comic diffusion model

tweak

Complete this image in a way that proves you won’t be replaced by AI

using AI is allowed

twitterthread

Frameworks

a personal view
  • Dall-E 2
  • Adobe Firefly
  • Midjourney
  • Stable Diffusion

Dall-E 2

dalle
pro con
  • kickoff
  • OpenAI is not open
  • not clear which data
  • model updates?

Adobe Firefly

adobe
pro con
  • new kid
  • first integrated one
  • not yet on par

Midjourney

MJ
pro con
  • easy: beautiful visuals
  • model updates
  • discord: community tool
  • images are public
  • images are public
  • only available through discord
  • artist copyright infringement

Stable Diffusion

pro con
  • open source community
  • expandability: animation, video, controlnet
  • artist-opt-out
  • train model variants
  • run locally
  • not as easy as Midjourney
    • to create beautiful images
    • to use all features
  • not integrated

start exploring!

kasper.jordaens@luca-arts.be

wouter.devriese@luca-arts.be

a guy standing on a beach, waving towards us, has a tuxedo suit and shorts.