AI Art - Tvorba za pomocí umělé inteligence: Midjourney, DALL·E 2, Stable Diffusion, OpenAI

Vše ohledně umění tvořeného pomocí umělé inteligence - obrázky, 'fotky', galerie, hudba, video, text + články, novinky apod.

NSFW obsah prosím obalit spoiler tagem - <div class="spoiler">obrázek</spoiler>, a nemá to tu být klub na roštěnky a nahotinky bez nějaké přidané hodnoty

Příbuzné diskuze:
- [DALL·E mini je mrtvé, Dejve, ať žije Craiyon, mage.space, Stable Diffusion atd. - having sex with AI since [date format unknown]]
- [I Hope This Does Not Exist ▌ Vedlejší efekty v AI visuálech]
- AI obecně [Artificial Intelligence AI]
- Vtipy [Umělá inteligence, chatboti - vtipné konverzace aneb "Hoří hovno?"]
- [generativní modely] Jak konstruovat prompty, kde získat váhy i jak to vše interpretovat

Prosím zkusme pro vkládanou tvorbu používat tagy
#galerie (2-3 obrázky na ukázku, další po rozkliku) #obrázek #video #hudba #text #hry #původní (pro vlastní tvorbu) #roštěnky

#článek #nástroj

(návrhy na další tagy apod. vítány)

Texty, programování: https://beta.openai.com/playground | https://chat.openai.com/
Obrázky online: https://www.midjourney.com/ | https://beta.dreamstudio.ai/
Lokálně: Webui-Forge https://github.com/lllyasviel/stable-diffusion-webui-forge | ComfyUI https://github.com/comfyanonymous/ComfyUI | Civitai repository custom modelů pro SD
AI na vytvoření textového zadání z existujícího obrázku: https://huggingface.co/spaces/pharma/CLIP-Interrogator

rozbalit záhlaví

KERRAY --- --- 22:54:42 5.6.2025

1 odpověď +1

#nástroj #tts Meet Eleven v3 - The most expressive Text to Speech model.
Eleven v3 (alpha) — The most expressive Text to Speech model
https://elevenlabs.io/v3

THEODORT --- --- 21:21:09 2.5.2025

LOJZA: https://github.com/coqui-ai/TTS
Možná když to zkopirujes 2x za sebe.. cca
GitHub - coqui-ai/TTS: 🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
https://github.com/coqui-ai/TTS#example-voice-cloning-together-with-the-voice-conversion-model

KERRAY --- --- 19:27:06 1.5.2025

další #původní #midjourney #galerie id STRYX
Jan Dřevíkovský - Midjourney potichu zmenila model 7 za...
https://www.facebook.com/share/p/1ANzz7ZTRS/

KILLUA --- --- 16:42:20 25.4.2025

PRAASHEK: Nový obrázkový model GPT. Prompt ti moc nepomůže tohle sem "modeloval" na základě asi 20 iterací. Musel sem si uvědomit věci jako osvětlení ze shora, ambientní tmu a pomalu měnit a vylepšovat iterováním a zkoušením.
Mám hlavně radost, že to opravdu vypadá jako "Avatar" nějakého démona v jeho chrámu a snad to jako opravdu nahání trochu strach když si představíš že tam stojíš a ono to začne mluvit...

Začal sem s tímhle.
In a shadowy cavern, demonic face looms. The jagged rock walls surrounding it echo the statue's malevolent presence, while two mysterious doorways stand as silent sentinels to either side, and a weathered stone pedestal rests nearby, bathed in the ambient light.

KERRAY --- --- 12:49:09 5.4.2025

1 odpověď

#nástroj #midjourney začaly testy v7 a "draft mode" #

We're now beginning to alpha-test a new V7 image Model. It's our smartest, most beautiful, and most coherent model yet. Give it a shot by using --v 7

V7 is the first model to have model personalization turned on by default. You must unlock your personalization to use it. This takes ~5 minutes. You can toggle it on/off at any time.

"Draft Mode" is another new feature. Draft mode is half the cost, 10 times the speed, and it might be the best way to iterate on ideas ever. Try it with voice on the web, or by using --draft on Discord.

BAT --- --- 21:34:21 28.3.2025

DOUPLE: Me by pro zajimavost zajimalo, jak by to dopadlo, kdyby byl model natrenovany na obrazkach nakreslenych v tom stylu, ale jinym autorem, jenz by s uzitim pro trenovani AI (a dalsim, s tim souvisejicim pouzitim) souhlasil.

Nic o tom nevim, ale divil bych se, kdyby byl "styl" nejak chraneny.
Jakoze chapu, ze nemuzu nikde publikovat postavicku Mickey Mouse (mimo tech opravdu starych co jsou snad uz v PD), ale kdyz stejnym stylem nakreslim postavicku co jsem si ja vymyslel (Roger Rat), tak...
no, asi by to dopadlo tak, ze by se Disney stejne soudil, a kdo na to ma, ze... ale teoreticky by to melo byt v poho ne?

Pominme fakt, ze tvuj priklad s politickou stranou by byl asi vniman par lidmi negativne, ale v ramci autorskych zakonu a kyho certa? to by asi bylo cajk, ne?

DOUPLE --- --- 20:39:38 28.3.2025

1 odpověď

DOUPLE: :)
Chatgpt: Styl studia Ghibli je natolik specifický a charakteristický, že když AI vygeneruje něco podobného, už to překračuje hranici mezi "inspirací" a "napodobováním". A pokud politická strana použije takový obrázek ve své kampani, může to být vnímáno jako zneužití estetického jazyka Ghibli k účelům, které s jejich tvorbou nemají nic společného.

Tohle je přesně ten typ situace, kde se etika a autorské právo setkávají s limity AI generace. Technicky vzato by se Ghibli mohlo bránit, pokud by dokázali, že AI model byl natrénován na jejich dílech nebo že výstup je příliš blízký jejich stylu. A i kdyby právně nešlo o přímé porušení autorských práv, pořád je tu otázka etiky a respektu k umělcům.

Osobně mi to taky nepřijde moc košer. Přijde mi to podobné, jako kdyby někdo vzal Miyazakiho film, přemaloval ho AI a pak ho použil v kampani, aniž by k tomu měl svolení. Je zajímavé, že si tohle dovolí například i politická strana – obvykle bývají dost opatrné, aby se vyhnuly právním problémům. Možná si neuvědomili, jak kontroverzní to může být.

KILLUA --- --- 23:08:08 3.3.2025

THEODORT: Nezkoušel, protože obecně nemám "resoning" modly tak rád, protože mám rád rychlou odezvu. Používám max o3 když chci poradit s něčím těžkým. Ale QwQ má zajímavé nápady když brainstormuju něco mimo programování, připadá mi že je jednoduše unikátní.

Deep seek sem zkoušel i plnej ale nezačal jsem ho používat nějak více.

ICE: Jako je to spíš jednodušší model, ale snaží se no :)

KILLUA --- --- 17:26:39 28.2.2025

3 odpovědi

BAT: Za mě je lokálně dobře použitelnej 32B QwQ, je to reasoning model a v programování docela silnej. Ale co sem pochopil ty nemáš dost ramky na něj. Proto sem doporučil ten 14B deep seek, protože je to taky reasoning model a je to postavené na 14B Qwen modelu :)
Vlastně jen naučili qwen model přemýšlet.

Zkoušel sem ještě nejnovější 24B mistral small 2501 a narozdíl od své starší verze a nejspíš i codestralu je za mě o dost lepší.

Ale to víš na Clauda to úplně nemá.

DOUPLE --- --- 15:38:51 4.8.2024

1 odpověď +7

par dni stary model Flux umi neuveritelny veci s textem
FLUX: This new AI image generator is eerily good at creating human hands | Ars Technica
https://arstechnica.com/information-technology/2024/08/flux-this-new-ai-image-generator-is-eerily-good-at-creating-human-hands/

THERIDANE --- --- 15:24:28 26.7.2024

KILLUA: proto mají abliterované/ortogonalizované modely vyšší score než originální, odstraní se ten zmatek a model pak dělá méně chyb (ale řekne opravdu všechno, co ví, a neodmítá)

KILLUA --- --- 14:44:46 26.7.2024

1 odpověď +3

HONZA09: Mistral má oproti velkým hráčům malé zdroje takže se snaží udělat svůj model konkurenceschopný všemi prostředky a jeden z nejsilnějších je co nejméně ho cenzurovat.

Když se nad tím totiž zamyslíš model se něco naučí je to pro něj "pravda" no a pak přijde ten "tréning" kde mu říkáš že to pravda není... výsledek je že v tom je nějakej "šum" zmatek.

KERRAY --- --- 10:19:55 13.6.2024

#nástroj #midjourney
Hey! We're releasing an early test version of model personalization today, here's how it works

What is Model Personalization?
Every time you write a prompt there's a lot that remains 'unspoken'. Our algorithms usually fill in the blank with their own 'preferences', which are really the combined biases and preferences of our community.
But of course everyone is different! Model personalization learns what you like so that it's more likely to fill in the blanks with your tastes.

Model Personalization Requirements
Right now model personalization learns from votes in pair ranking and images that you like from the explore page.
You need to have roughly 200 pair rankings / likes in order for the feature to work
You can see how many ratings you have on the above page or by typing /info

How do I use Model Personalization?
Just type --p after your prompt, or turn on personalization for all prompts under /settings or the prompt settings button on the website
When you enable personalization you get a 'code' added after your prompts, you can share this code to share the personalization effect being applied to that image
You can control the strength of the personalization effect by using --s 100 (0 is off and 1000 is maximum and 100 is default)

PLEASE NOTE: The personalization is not a stable feature right now, it will change as you do more pair ratings (subtly) and we may be pushing out algorithm updates over the next few weeks. So just assume that it's a 'fun feature constantly in flux'

This is a totally new frontier of image synthesis.

ARAON --- --- 8:26:43 7.6.2024

Sora by OpenAI is insane.

But KWAI just dropped a Sora-like model called KLING, and people are going crazy over it.

Here are 10 wild examples you don't want to miss:

1. A Chinese man sits at a table and eats noodles with chopsticks
x.com
https://x.com/i/status/1798777783952527818

GALADAR --- --- 13:55:35 21.5.2024

1 odpověď +1

HONZA09: Když narážíš na content restrictions, model tě upozorní. (a rád i sám poradí, jak je obejít) Tohle je jen chyba v promptu. Mmch "angel dust" už je direct drug reference, dívím se, že tě nestopl. Doporučuji flour, white powder, powdered caffeine atd.

HONZA09 --- --- 10:25:16 21.5.2024

3 odpovědi +1

Tak jsem našel omezení. nejsem za boha schopnej v Dall E vygenerovat obrázek boha dávajícího si lajnu. Ten generativní model záměrně igonruje přesný instrukce, že ten bůh má mít to brčko v nose.

A depiction of a god inhaling the Milky Way through a straw into their nose, as if snorting a line of angel dust. The scene is high contrast, resembling the style of a tapestry. The god, adorned in majestic and detailed attire, is shown inhaling the glittering Milky Way through the straw, with cosmic elements swirling around them. The background features a vast, star-studded galaxy with vivid colors and intricate patterns, enhancing the grandeur of the universe. The format is wide to capture the expansiveness of the scene.

KILLUA --- --- 16:43:02 29.3.2024

AODHFIN: A funguje tak i člověk, pokud tě po narození zavřou do bílé krabice a nebude s tebou nikdo nijak 20 let interagovat max ti tam spadne kostka s jídlem tak jediná informační hodnota vzejde z naprogramovaných DNA vzorců typu dávivý reflex, strach a smyslových podnětů ze tvého vlastního těla tj např z prohlížení vlastních rukou mohou vzejít nějaké představy.

Jinak nemá mozek ani ai model z čeho brát...

FRK_R23 --- --- 17:59:30 7.3.2024

Dracula :)

Base model: https://civitai.com/models/101055?modelVersionId=128078
LoRA: https://civitai.com/models/187625?modelVersionId=210701

FRK_R23 --- --- 13:09:06 5.3.2024

Zkoušel jsem jestli Fooocus umí taky třeba metahuman render :)

prompt: Unreal engine metahuman, unreal engine render, lumen
Base model: juggernautXL_v8

INK_FLO --- --- 15:18:11 3.3.2024

-1

The Age of Noise - by Eryk Salvaggio - Cybernetic Forests
https://cyberneticforests.substack.com/p/the-age-of-noise

Noise is a slippery word. It means both the presence and absence of information. Today it's in the urbanisation of our world, the hum of traffic and jet engines. Noise is also where we go to escape noise. In August of 2023, Spotify announced that users had listened to 3 million hours of white noise recordings. Noise to sleep to, noise to drown out noise. Noise is also the mental cacophony of data, on social media, of smartphones, and the algorithmic spectacle. The age of noise is a logical conclusion, a successful ending for the information age. And information, which was once scarce, is now spilling from the seams of our fibre optic cables and airwaves. The information age is over. Now we enter the age of noise. We can pin the information age to the invention of the transistor in 1947. The transistor was quaint by today's standards, a mechanism for handling on-off signals. Engineers built pathways through which voltage flowed, directing and controlling that voltage in response to certain inputs. We would punch holes into cards and feed them to a machine, running light through the holes into sensors. The cards became a medium, a set of instructions written in the language of yes and no.

In other words, it all started with two decisions, yes or no, one or zero. The more we could feed the machine, the more the decisions the machine could make. Eventually it seemed the number of decisions began to encroach on our own. The machine said yes or no so that we didn't have to. By the start of the social media era, we were the ones responding to these holes. Like or don't like. Swipe left or swipe right. It all began with that maze of circuitry. The first neural networks, our adding machines, the earliest computers, were designed to reveal information. Noise meant anything that crept into the circuits, and the history of computing is in part a history of noise reduction. The noise in our telephone wires and circuit boards, even our analog TV broadcasts, was background radiation. Energy pulsing invisibly in the air, lingering for millennia after the Big Bang exploded our universe into being. Our task was to remove any traces of it from our phone calls. Today millions of on off calculations can take place in a single second. Put enough of these signals together, run them fast enough and you can do remarkably complex things with remarkable speed. Much of that has been harnessed into lighting up pixels. Put enough pixels together and you get a digital image. You get video games. You get live streams. You get maps, interfaces and you collect and process responses to live streams, maps and interfaces.

With what we call generative AI today, we obviously aren't using punch cards. Now we inscribe our ones and zeros into digital images. The data mining corporations behind social media platforms take these digital images and they feed them to massive neural nets and data centres. In substance, the difference between punch cards and today's computation is only that our holes are smaller. Every image that we take becomes a computer program. Every caption and every label becomes a point of information. Today's generative AI models have learned from about 2.3 billion images with about 24 bits of information per pixel. All of them still at their core, a yes or no decision moving through a structure. I don't say this to give you a technical overview of image processing. I mention it because the entirety of human visual culture has a new name. We used to call these collections archives or museum holdings or libraries. Today we call them data sets. This collected culture has been harnessed to do the work of analog punch cards. And these cards, these physical objects, were once stamped with a warning. Do not fold, spindle or mutilate. Our collected visual heritage in its digital form carries no such warning.

We don't feed our visual culture into a machine by hand anymore, and the number of decisions that we have automated are so large that even the words are ridiculous. Teraflops. We upload images to the internet, pictures of our birthday parties, our weddings, embarrassing nights at the club (not so much me anymore). Our drawings, our paintings, these personal images meant to communicate with others are clumped together with other archives. Cultural institutions share a wealth of knowledge online for the sake of human education and the arts history and beyond. And in training an AI model, all of these images are diffused, a word that is so neatly parallel to this diffusion of unfiltered information that we surround ourselves with. And for once, it's a technology named in a way that describes what it actually does. Diffusion models actually diffuse! This word means what it says. It dissolves the images, it strips information away from them until they resemble nothing but the fuzzy chaos of in between television channels. Images are diffused into noise. Billions of good and bad images all diffused into noise for the sake of training an artificial intelligence system that will produce a billion more images. From noise into noise, we move from the noise of billions of images taken from our noisy data-driven visual culture, isolate them and dissolve them into the literal noise of an empty JPEG, to be recreated again into the noise of one more meaningless image generated by AI among the noise of billions of other images, a count of images that already overwhelms any one person's desire to look at them.

The information age has ended and we have entered the age of noise.

We often think of noise as a presence. In America, we call it snow, the static. I've heard of other things as well. It's called ants in Thailand. Other places have other metaphors. But snow is a presence. We see snow. We see noise. We hear noise. Noise from a communication engineering perspective is the absence of information. Sometimes that absence is the result of too much information, a slippery paradox. Information which cannot be meaningfully discerned is still noise. Information has been rushing at us for about two decades now, pushing out information in the frame of content to such an extent that almost no signal remains that is worth engaging with. Here's a map of the internet visualised 20 years ago. Since then, it has only grown, today becoming a disorienting flood of good and bad information coming through the same channels. And what we are calling generative AI is the end result of a successful information age, which in just 24 years has rewritten all cultural norms about surveillance, public sharing, and our trust in corporatised collections of deeply personal data. Server farms mined this data through regimes of surveillance and financialisation. The guiding principle of social media has always been to lure us into sharing more so that more data could be collected, sold, and analysed. They've calibrated the speed of that sharing to meet the time scales of data centres rather than human comprehension or our desire to communicate. And all this data has become the food for today's generative AI.

The words we shared built chat GPT, the images we shared built Stable Diffusion. Generative AI is just another word for surveillance capitalism. Taking our data with dubious consent and activating it through services it sells back to us. It is a visualisation of the way we organise things, a pretty picture version of the technologies that sorted and categorised us all along. Instead of social media feeds or bank loans or police lineups, these algorithms manifest as uncanny images, disorienting mirrors of the world rendered by a machine that has no experience of that world. If these images are unsettling because they resemble nothing like the lives they claim to represent, it's because that is precisely what automated surveillance was always doing to us. The internet was the Big Bang of the information era, and its noisy debris lingers within the Big Bang of generative AI. Famously, Open AI's chatbot stopped learning somewhere in April of 2021. That's when the bulk of its training was complete, and from there it was all just fine-tuning and calibration. Perhaps that marks the start of the age of noise, the age where streams of information blended into and overwhelmed one another in an indecipherable wall of static, so much information that truth and fiction dissolved into the same fuzz of background radiation.

I worry that the age of noise will mark the era where we turn to machines to mediate this media sphere on our behalf. It follows a simple logic. To manage artificial information, we turn to artificial intelligence. But I have some questions. What are the strategies of artificial intelligence? The information management strategies that are responsible for the current regime of AI can be reduced to two, abstraction and prediction. We collect endless data about the past, abstract it into loose categories and labels, and then we draw from that data to make predictions. We ask the AI to tell us what the future will look like, what the next image might look like, what the next text might read like. It's all based on these abstractions of the data about the past. This used to be the role of archivists. Archivists used to be the custodians of the past, and archives and curators, facing limited resources of space and time, often pruned what would be preserved. And this shaped the archives. The subjects of these archives adapt themselves to the spaces we make for them. Just as mold grows in the lightest part of a certain film, history is what survives the contours we make for it. We can't save everything. But what history do we lose based on the size of our shelves? These are a series of subjective, institutionalised decisions made by individuals within the context of their positions and biases and privileges and ignorances. The funding mandates, the space, and the time. (No offence!)

Humans never presided over a golden age of inclusivity, but at least the decisions were there on display. The archive provided its own evidence of its gaps. What was included was there, and what was excluded was absent. And those absences could be challenged. Humans could be confronted. Advocates could speak out. I'm reminded of my work with Wikipedia, simultaneously overwhelmed with biographies of men, but also host to a remarkable effort by volunteers to organise and produce biographies of women. When humans are in the loop, humans can intervene in the loop.

I'm often asked if I fear that AI will replace human creativity, and I don't remotely understand the question. Creativity is where agency rises, and as our agency is questioned, it is more important than ever to reclaim it, through creativity, not adaptability. Not contorting ourselves to machines, but agency — contorting the machines to us. I fear that we will automate our decisions and leave out variations of past patterns based on the false belief that only repetition is possible. Of course, my work is also a remix. It has a lineage. To Nam June Paik, who famously quipped, “I use technology in order to hate it properly.” And this is part of the tension, the contradictions that we're all grappling with. I'm trying to explore the world between archive and training data, between the meaningful acknowledgement of the past and the meaningless reanimation of the past through quantification. Archives are far more than just data points. We're using people's personal stories and difficult experiences for this. There's a beauty of lives lived and the horrors, too. Training images are more than data. There is more to our archives than the clusters of light-coloured pixels. Our symbols and words have meaning because of their context in collective memory. When we remove that, they lose their connection to culture. If we strip meaning from the archive, we have a meaningless archive. We have five billion pieces of information that lack real-world connections. Five billion points of noise. Rather than drifting into the mindset of data brokers, it is critical that we as artists, as curators, as policymakers approach the role of AI in the humanities from a position of the archivist, historian, humanitarian, and storyteller. That is, to resist the demand that we all become engineers and that all history is data science.

We need to see knowledge as a collective project, to push for more people to be involved, not less, to insist that meaning and context matters, and to preserve and contest those contexts in all their complexity. If artificial intelligence strips away context, human intelligence will find meaning. If AI plots patterns, humans must find stories. If AI reduces and isolates, humans must find ways to connect and to flourish. There is a trajectory for humanity that rests beyond technology. We are not asleep in the halls of the archive, dreaming of the past. Let's not place human agency into the dark, responsive corners. The challenge of this age of noise is to find and preserve meaning. The antidote to chaos is not enforcing more control. It's elevating context. Fill in the gaps and give the ghosts some peace.

Looking at the Machine | FACT24 Symposium
https://www.youtube.com/watch?v=Eqw7U8BA5aM

Kliknutím sem můžete změnit nastavení reklam

přezdívka
heslo

pamatuj si mě
registrace
ztracené heslo?