[15:02 Tue, 2 May 2023 by Rudi Schmidts]
Stability AI (which, among other things, is also a major backer of the open-source Stable Diffusion) has introduced another image generator, DeepFloyd/IF. It is said to be particularly well suited to lettering and graphics.
Stable Diffusion's language seems to be out of this world

The problem of garbled lettering should now be over, because the new DeepFloyd/IF model is said to allow photorealistic images with legible lettering. It is also said to be particularly suitable for graphic tasks such as logo design.

DeepFloyd is based on Google's AI image generator Imagen. This works somewhat differently from Stable Diffusion and combines an open-source large language model (LLM) from Google (T5-XXL-1.1) with a pixel diffusion model. The latter works in three stages: it initially generates only 64 x 64 pixel images, which are then upscaled twice via super-resolution, first to 256 x 256 pixels and then to the output resolution of 1024 x 1024 pixels. The image generator was trained on the established LAION-A data set with 1.2 billion images.

DeepFloyd/IF can generate readable text and graphics

An official web image generator for trying out DeepFloyd/IF online does not exist yet, because the current license only permits research use, not commercial use. However, if you want to "research" it yourself, you can find [...]

At the same time, DeepFloyd/IF also heralds a new era for AI use at home. While previous Stable Diffusion models already run on graphics cards with about 6 GB of memory, DeepFloyd now requires at least 16 GB of GPU memory. For the higher-quality (and thus larger) model, 24 GB is even mandatory. Such sharply increasing GPU memory requirements in upcoming AI applications [...]
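To make the three-stage cascade described above more concrete, here is a minimal sketch of the resolution arithmetic only. It does not call DeepFloyd/IF or any real library; the `Stage` class, the stage names and the helper functions are hypothetical and exist purely for illustration. The only figures taken from the article are the 64, 256 and 1024 pixel edge lengths.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One step of the cascade. Names are illustrative, not DeepFloyd identifiers."""
    name: str
    out_side: int  # output edge length in pixels (square images assumed)


# Edge lengths as stated in the article: 64 -> 256 -> 1024.
CASCADE = [
    Stage("base pixel diffusion", 64),
    Stage("super-resolution 1", 256),
    Stage("super-resolution 2", 1024),
]


def upscale_factors(stages):
    """Return the edge-length factor between each pair of consecutive stages."""
    return [b.out_side // a.out_side for a, b in zip(stages, stages[1:])]


def pixel_counts(stages):
    """Return the number of pixels in each stage's output image."""
    return [s.out_side ** 2 for s in stages]


if __name__ == "__main__":
    print(upscale_factors(CASCADE))  # [4, 4]
    print(pixel_counts(CASCADE))     # [4096, 65536, 1048576]
```

Each super-resolution step quadruples the edge length, so the final 1024 x 1024 image contains 256 times as many pixels as the 64 x 64 image from the base stage.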