The "Human" Element in a Generative World
For past some weeks, I have spent some time trying Image Generation tools like Nano Banana and ChatGPT Images, generating hyper-realistic portraits. In seconds, I got perfect lighting, flawless subjects, and zero friction. The technical barriers to creating a "beautiful" image are completely gone. And yet, this morning, I still picked up my smartphone and walked out into the heat. Why do we still deal with bad lighting, and the frustration of missing a fleeting frame? It seems almost stupid when you can generate infinite perfect worlds without leaving your desk.
Talking about memory: the actual value of photography has shifted. It's no longer about documenting reality (we have phone cameras for that anyway). It's about witnessing it. When I generate an image, I'm the director. I control the lighting, the subject, the mood. I'm a mini-god in a clean digital universe. But with a camera, I'm just an observer subject to real-world chaos. The light might be terrible, the subject might blink, or a random auto rickshaw might block the shot. But that lack of control is where the actual humanity is. Some people won't get why that matters, and that's alright.
"Photography is a way of feeling, of touching, of loving. What you have caught on film is captured forever... it remembers little things, long after you have forgotten everything." - Aaron Siskind
To be honest, the friction is the entire point. There's a real, physical effort in manual photography. You have to walk around to change perspective, wait patiently for the clouds to clear, and decide to press the shutter now (and not a millisecond later). This friction forces us to actually look at the world, not just imagine it. A model can generate a flawless sunset, but it won't ever capture the cold breeze on your face while you stood there watching the sun go down.
At the end of the day, generative AI is about imagination, but photography is about perception. They aren't enemies, but they're completely different. One looks inward, pulling from standard internet datasets to synthesize something new. The other looks outward, framing a real slice of the world. In a world flooded with clean, AI-generated pixels, the human flaws (the shaky hands, the film grain, the slightly missed focus) become the actual signature of authenticity. It's proof that you were there and saw it happen. No diffusion model can replicate that testimony.
I honestly hope we never lose the urge to step outside and capture the world as it is, even if it's imperfect.