Enhance! AI Super Resolution Is Here!

71,423 views

Two Minute Papers

1 day ago

❤️ Check out Weights & Biases and sign up for a free demo here: wandb.me/papers
📝 The paper "Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild" is available here:
supir.xpixel.group/
📝 My latest paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or here is the original Nature Physics link with clickable citations:
www.nature.com/articles/s4156...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Bret Brizzee, Gaston Ingaramo, Gordon Child, Jace O'Brien, John Le, Kyle Davis, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Putra Iskandar, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: / twominutepapers
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Károly Zsolnai-Fehér's research works: cg.tuwien.ac.at/~zsolnai/
Twitter: / twominutepapers

COMMENTS: 299
@juhor.7594 2 months ago
The use of negative prompts is pretty clever. Sort of like using an adversarial network.
@MrJacobegg 2 months ago
It's not really like using a GAN at all. Negative prompts are commonly used in statistical image generation techniques (like Stable Diffusion) to tell the algorithm what concepts you don't want your output to resemble. It's basically like telling the algorithm "not only am I asking for a picture of a car, but I also specifically do NOT want anything resembling a truck or SUV in any way."
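The negative prompt the comment describes usually enters a diffusion sampler through classifier-free guidance: the sampler starts from the prediction conditioned on the negative prompt and pushes toward the positive one. A toy sketch with dummy arrays standing in for the model's noise predictions (the function name and numbers are illustrative, not the actual Stable Diffusion code):

```python
import numpy as np

def guided_noise(eps_cond, eps_neg, scale=7.5):
    """Classifier-free guidance with a negative prompt: start from the
    negative-prompt prediction and push the result toward the positive
    prompt, away from the negative one."""
    return eps_neg + scale * (eps_cond - eps_neg)

# Dummy noise predictions standing in for two UNet forward passes
eps_cond = np.array([1.0, 0.5])   # conditioned on "a photo of a car"
eps_neg  = np.array([0.2, 0.1])   # conditioned on "truck, SUV"

print(guided_noise(eps_cond, eps_neg, scale=2.0))  # → [1.8 0.9]
```

A higher guidance scale pushes the sample further from whatever the negative prompt describes, which is exactly the "I do NOT want anything resembling a truck" behaviour.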
@juhor.7594 2 months ago
I get what you're getting at, though in this situation the negative prompts were used for training against failure cases. Kind of like GAN training.
@MrJacobegg 2 months ago
@@juhor.7594 If that's the case, then I guess it could be more similar to GANs. My interpretation was that they were giving the end user the ability to use both positive and negative prompts as guide rails for the enhancement. But I've only skimmed the paper very quickly and wasn't focused on that aspect of it.
@bzikarius 2 months ago
Reminds me of the WRONG LoRA for Stable Diffusion. It was also trained on bad images, and it enhances generation results. There are also positive prompts, which can be generated with modern taggers or a kind of CLIPVision and then corrected by a human.
@bobclarke5913 2 months ago
These should be called "re-imaginings" rather than "enhancements".
@user-xn2gr8me2u 2 months ago
Agreed. The AI algorithm replaces the original image rather than enhancing it. What if the blurred input is an object that has never been photographed, such as an object in an art museum? Will the result be what we're hoping for?
@goodfortunetoyou 2 months ago
Just click the Edit-in-post button!
@chosen_oNEO 2 months ago
I agree, but damn, that's almost a spot-on reimagining of the real features.
@dmhzmxn 2 months ago
The word "enhancement" doesn't require anything to be real, just enhanced... so no, I'm going to use normal-people words, thanks.
@rafaelhenrique-hp5bo 2 months ago
It's the same as that paper that could draw a super-realistic picture from a poor-quality sketch made by the user, only this time the sketch given to the AI was way more detailed (compared to hand-drawn stick figures).
@Mushbee 2 months ago
This looks more like gen fill than upscaling, to be honest. Just check out the car example: the manufacturer logo changed completely even though there was enough information to at least rebuild it into some similar shape.
@MikkoRantalainen 2 months ago
I agree. I think the car in the blurry photo was actually a Dacia Duster 2016 (or close to that), and the grill looked like something that belongs to a Nissan instead. Even as is, it could be used to generate creative design solutions - just give a heavily blurred original design as input and generate multiple responses.
@Lolkork 2 months ago
Same with the Counter-Strike example: the model tried to turn it into a realistic image. Probably didn't train much on screenshots of old video games, I guess.
@NeovanGoth 2 months ago
@@MikkoRantalainen Since this algorithm allows an additional text prompt, one could use another AI that has more world knowledge (allowing it to correctly identify the car's brand and model) to feed it additional information.
@MrGTAmodsgerman 2 months ago
This isn't a bad thing, really. Try the Magnific AI or Krea AI upscaler/enhancer; you can control how creative the upscale is there. Write a prompt (which is the key here), and you can still upscale with a previous regular upscaler and mix both in Photoshop to get rid of the bad parts. But the true value here is that you can give it context via a prompt, which is what all the previous methods got wrong. It can actually "see" what's in the image. Two papers down the line and it should even figure out that it's a Dacia Duster 2016, maybe by combining it with a ChatGPT-type approach.
@user-xn2gr8me2u 2 months ago
Wait... this algorithm could be used as a tool for misinformation/disinformation. What if the input prompt is intentionally made incorrect, and people then say the image was "enhanced"? Most people think an enhanced image doesn't change the original.
@foolwise4703 2 months ago
This can become a real problem when it comes to the truthfulness of data, though. Imagine this being applied to surveillance cameras used to identify criminals. The AI is likely to extrapolate faces toward common features rather than recognizable, true ones.
@85Pando 2 months ago
And investigators, police and so on will start believing them to be "true images"... Things like these make me more afraid than optimistic about these technologies. The neural nets are imagining stuff based on their training data in a way that matches the input.
@lawrencefrost9063 2 months ago
I don't think so.
@Rastafa469 2 months ago
Using these techniques to extract more information out of images is ridiculous! You just can't create new information from an image that wasn't in there before. No pixel you add to the image can tell you anything about the original scene, and this should be obvious. Whoever thinks this works is just stupid and has seen too many crime TV shows.
@ciragoettig1229 2 months ago
That doesn't make any sense; if it demonstrably doesn't prove anything, why would we take it as evidence of anything in court? I could see some confusion about new tech initially, though.
@SianaGearz 2 months ago
Actually, there is an old technique from the early 2000s that is suitable for surveillance footage; it was originally developed to recover high-noise deep-ocean video. It may have been the one that coined the word "superresolution". Basically, you align multiple images of the same subject to each other at a higher resolution, then you can noise-reject and deblur the resulting image. The outcome still needs manual validation though, because occlusion/deocclusion can create artefacts. Obviously, footage manipulated like that is usually not proof; surveillance just needs to archive footage that is as unmolested as sensible. Essentially this became DLSS 2 and FSR 2 today.
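The classical multi-frame idea described above can be sketched in a few lines: upsample each aligned low-resolution frame onto a finer grid and average them, so uncorrelated noise cancels out. A toy version (assuming the frames are already aligned, which is the hard part in practice):

```python
import numpy as np

def multiframe_sr(frames, factor=2):
    """Naive multi-frame super-resolution: nearest-neighbour upsample
    each (already aligned) low-res frame onto a finer grid, then
    average across frames to reject noise."""
    ups = [np.kron(f, np.ones((factor, factor))) for f in frames]
    return np.mean(ups, axis=0)

rng = np.random.default_rng(0)
clean = np.ones((4, 4))
# 64 noisy observations of the same scene
frames = [clean + rng.normal(0.0, 0.5, clean.shape) for _ in range(64)]
sr = multiframe_sr(frames)
print(sr.shape)  # (8, 8): twice the resolution of each input frame
```

Averaging 64 frames shrinks the noise standard deviation by a factor of 8, which is why this works on noisy footage; unlike the generative approach in the video, nothing here is hallucinated.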
@fen4554 2 months ago
I love how the triangle badge turned into a Nissan lol. AI: close enough.
@MikkoRantalainen 2 months ago
Yeah, the original input was probably year 2016 Dacia Duster.
@Mr.MasterOfTheMonsters 2 months ago
@@MikkoRantalainen More like 2019, I think. But it's a Dacia Duster for sure.
@Kuj 2 months ago
Honestly, not knowing what a Dacia is, my guess was also a Nissan. Probably has the same training data set I have :P
@KellyNicholes 2 months ago
Nissan slips them a $50
@Kuj 2 months ago
@@KellyNicholes I've never actually thought about that, but brands slipping larger sets of their images into training data actually sounds like a great marketing tactic for them. Yikes. Welcome to the future of subliminal advertising.
@gaweyn 2 months ago
What was missing: taking a high-res photo (ground truth), creating a low-res version to be processed by this model (SUPIR), and finally comparing SUPIR's output to the ground truth. Otherwise a very promising paper.
@MrJacobegg 2 months ago
It's not clear from your comment whether you know this already, but that's a big part of how these models are trained. Still, I agree that it's disappointing they included no discussion or visual examples in the paper comparing their results to the ground truth. Like... yes, your results look very good, but how do they compare qualitatively to the real thing?
@victorcadillogutierrez7282 2 months ago
@@MrJacobegg They do add the SSIM, LPIPS, and PSNR metrics, which are commonly used to quantify the relation between the ground truth and the inferred result. It can be a little vague in the qualitative aspect and works better as a metric to benchmark between methods. There are a few shots of ground truth, input and prediction in the paper.
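Of the metrics mentioned, PSNR is the simplest: a log-scaled mean-squared error between the restored image and the ground truth. A minimal version (my own sketch, not the paper's evaluation code):

```python
import numpy as np

def psnr(ground_truth, restored, max_val=255.0):
    """Peak signal-to-noise ratio in dB; higher means the restored
    image is numerically closer to the ground truth."""
    mse = np.mean((ground_truth.astype(float) - restored.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

gt = np.full((8, 8), 100.0)
# A restored image that is uniformly off by 10 gray levels
print(psnr(gt, gt + 10.0))  # MSE = 100 → ~28.13 dB
```

This also illustrates the thread's point: PSNR averages over all pixels, so an output can score well on sky and trees while still getting a face wrong, which is exactly why a side-by-side qualitative comparison matters too.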
@MrJacobegg 2 months ago
@@victorcadillogutierrez7282 They do, yes. But those are quantitative, and I was specifically talking about qualitative comparison, because I think that's important in a paper on image enhancement and super-resolution. An algorithm could get a great quantitative score because the trees, snow, sky, clouds, etc. in a scene look very close to the ground truth. But then it changes things about the face of the subject of the photo in a way that no longer looks like the same person at all. In fact, my experience has been that this seems to be a very common failure mode for these algorithms, at least when used in the wild.
@victorcadillogutierrez7282 2 months ago
Doing qualitative analysis on massive datasets is expensive, and the overrepresentation of labels like dogs, cats, whatever over faces is probably causing that. But also remember that high-frequency details are harder to reconstruct, because the downsampling in the diffusion process leads to a loss of information. And to some degree qualitative analysis is subjective to the person doing it: you want faces or fingers not to fail, but someone else working with scenery would like to get better trees and pedestrians. So choosing a model on qualitative aspects is something the user has to do, because it's impractical for benchmarking between foundational models - or you might also need to finetune to your domain.
@MrJacobegg 2 months ago
@@victorcadillogutierrez7282 Again... I wasn't talking about an exhaustive or costly qualitative analysis. The authors already provide multiple examples, in the paper and on their website, where they show a low-resolution input and then show their upscaled output alongside the output from other algorithms for a qualitative comparison! It is NOT expensive, or even time-consuming, to include the ground truth alongside the upscaled outputs. That kind of comparison is even a de facto standard for papers on upscaling and image enhancement. Granted, those examples can be, and usually are, cherry-picked and should be taken with a pinch of salt. But at least it can give you an idea of some of the visual limitations - and good authors will also include a range of examples, including failure cases.
@jackmg 2 months ago
Agreed, the AI is making it up as best it can from the references it has, so you are essentially looking at a new render rather than an enhanced image.
@Razumen 2 months ago
Can't say this is something we'd want to use regularly on images from the net, considering how it basically hallucinates details that aren't there.
@Darhan62 2 months ago
You can't resolve information that's not actually present in the pixelated image. You *can* allow the model to make a reasonable guess and come up with something visually convincing. If there are several images in a series that the model can draw from for context, then presumably you can make the "upscaled image" more accurately represent what was originally there. But that probably requires another paper. The idea of using text prompts to help it understand what the missing information is is sort of a half-way measure to doing that.
@Onaterdem 2 months ago
Yes, using different images to resolve the same context would be much more accurate and a much more useful tool since we generally upscale videos and/or games, not still images
@MimicSilhouette 2 months ago
Another great video, keep it up! ❤
@scourgehh714 2 months ago
I can't wait until this becomes open to the public. It would be so helpful.
@nassifsamuel55 2 months ago
Amazing, so many applications for this upscaling
@undergroundxp 2 months ago
WHAT A TIME TO BE ALIVE!
@lod4246 2 months ago
It is good day to be not dead!
@davicosthacripto6375 2 months ago
@@lod4246 *pow* you are dead
@lobabobloblaw 2 months ago
It is indeed a fantastic model. Can’t wait for motion implementations.
@davidvincent380 2 months ago
Training models with bad samples associated with negative prompts is such a good idea, you wonder why it hasn't been done before. And that method can apply to all kinds of generative AIs.
@brianhauk8136 2 months ago
Holy mother of papers! I agree and look forward to seeing appropriate use of this technology preceded by societal discussion to define what's appropriate and what is not.
@jonbrooks8232 2 months ago
New Two Minute Papers! Thank you for keeping us informed, Dr. Károly Zsolnai-Fehér!
@LouisGedo 2 months ago
Wow! Astonishing!
@thechanotv8202 2 months ago
4:35 Holy Mother of Papers! 🤯
@Cordis2Die 2 months ago
😂😂😂
@OneCharmingQuark 2 months ago
Computer, enhance!
@bobclarke5913 2 months ago
Computer, hallucinate! ;)
@geraldhewes 2 months ago
Simply amazing
@iuristasiv9360 2 months ago
Impressive!
@fiddley 2 months ago
I'd love to see some of the old movies run through this, to see what they might look like if done today.
@holahandstrom 2 months ago
Looking forward to seeing it used on space telescope images and in phones... or on those always-blurry UFO images ;D ... It might be used as a pre-warp signature - "Can they see us? Nope. We'll come back later." Seriously, great progress! I guess video is the next step.
@felicityadj5886 2 months ago
6:12 Superstar on the left..mass
@mariokotlar303 2 months ago
You should have mentioned that this works by using SDXL (Stable Diffusion XL). Quoting the paper: "Specifically, SUPIR employs StableDiffusion-XL (SDXL) [63] as a powerful generative prior, which contains 2.6 billion parameters. To effectively apply this model, we design and train a adaptor with more than 600 million parameters." The only thing holding us back from using SDXL like this ourselves, to upscale to very high resolutions without server-grade GPUs, is the lack of a ControlNet tile model for it.
@percywhitehead9228 2 months ago
I don't understand why you'd imply this AI upscaling could possibly "guess" the number plate. It has like 4 blurry lines. The number plate doesn't even properly overlap the original image.
@eafindme 2 months ago
I consider it hallucination, because entropy is entropy: if information is lost, it is lost forever. So any work that tries to refill the missing details is just an attempt to guess the information, not the real thing. To make matters worse, we choose to believe it.
@AMA14700 2 months ago
What a time to be alive
@tamlynburleigh9267 2 months ago
Could be really helpful in astronomy.
@feynstein1004 2 months ago
I guess enhancing photos in movies wasn't unrealistic, it was just futuristic 😀
@NeovanGoth 2 months ago
The reference test for any "enhance" algorithm should be how well it is able to upscale old Star Trek episodes from SD to 4K.
@the-secrettutorials 2 months ago
Did the competitor models also have the whole zoomed out image available as context?
@paulalexandrupop3709 2 months ago
2:14 - managed to turn a Dacia Spring into a Nissan 🪄
@MikkoRantalainen 2 months ago
I would guess Dacia Duster 2018 instead, but definitely not a Nissan.
@reckless_programmer 2 months ago
4:37 "Holy mother of papers!"
@arothmanmusic 2 months ago
I wonder how soon this will be available in an accessible tool like Upscayl?
@landwirtschaft2116 2 months ago
Remember how in the 90s crime TV shows claimed this was possible? Having bad surveillance footage, and then the detective/investigator character standing behind the 'computer specialist' character, and the dialog goes like: "Can you enhance it?" - "Okay, applying extrapolation," or some bullshit haha…
@Harrock 2 months ago
Does it work with old tape footage? Like old Formula 1 races from the 1970s-2000s?
@trickybarrel444 2 months ago
Very promising but the models are not available yet.
@GetterGo 2 months ago
I’m surprised that the car brand badge went so wrong. I would have thought that car badges would have a pretty strong dataset by now.
@celozzip 2 months ago
Why are we not seeing an original HD picture, turned into a low-res picture, then upscaled with the AI, so we can compare the AI upscale to the HD original?
@somethingaboutmirrors 2 months ago
Agreed. Wouldn’t that be the ultimate benchmark?
@TheStabbedGaiusJuliusCaesar 2 months ago
I look forward to when I can just drag and drop an old video file into a program, hit start, and end up with a fully upscaled video with crisp lines.
@vectoralphaAI 2 months ago
My 80s and 90s SD anime collections would be pristine.
@TheStabbedGaiusJuliusCaesar 2 months ago
@@vectoralphaAI - My old VHS rips of old cartoons from the same and older periods would be too.
@MrGTAmodsgerman 2 months ago
Finally, the Krea AI/Magnific AI type of upscaler as open source. Hopefully it can run on regular hardware.
@HDfoodie 2 months ago
What is the best publicly available model for accurately upscaling video?
@ExtantFrodo2 2 months ago
Up-scaling video might actually be easier since there is more information in the source.
@LeeBrenton 2 months ago
Temporal coherence? (Could I use this technique for video upscaling? Would that weird monkey-man character's face be stable over time, or would the upscaled clear version flicker all over the place?)
@DrD0000M 2 months ago
The monkey man is from "Journey to the West" (1986 TV series) BTW.
@kevincrady2831 2 months ago
Fantastic! Now we can start analyzing all those blurry UFO photographs. 😂
@gatsby66 2 months ago
🤣
@gatsby66 2 months ago
And those blurry images of fake monsters on land and in sea. 😂
@posterblue 2 months ago
Enhancing pixelated photos should come with a disclaimer, since other derivatives could be created, none of which may be identical to the original. Where 'scaling up' images will prove game-changing is when it can be applied to pixelated video. Using all of the sequential images to depixelate a video until there is only one 'true' solution - the original video prior to pixelation - will probably throw a lot of people into shock!
@bzikarius 2 months ago
On the one hand it is great that the AI not only restores pixels, but tries to restore surfaces and structures according to its knowledge. On the other hand we can see that the AI is not intelligent enough to infer regular structures like a stone pattern or black-and-yellow parallel stripes. So perhaps this AI needs the help of another one, like SAM, to separate blocks and try to infer their features. Anyway, such models will be a great help to designers, because retouching is a very time-consuming task.
@Uthael_Kileanea 2 months ago
2:05 - Forget the license plate! The logo changed shape drastically!
@danielemammoli5177 1 month ago
In the future they will think that CSI scenes with "zoom in... zoom in... enhance... zoom... enhance... there! He's our guy!" were actually real and ahead of their time.
@Kavriel 2 months ago
Did you not do a video on Magnific AI? That's super interesting, with high-res results.
@mayorc 2 months ago
This seems to use SDXL and LLaVA plus an algorithm to upscale and increase quality; considering how it works, it could be embedded in Stable Diffusion WebUI or ComfyUI.
@hoodhommie9951 2 months ago
@1:50 Finally we get the technology that detectives have been using in movies... "ENHANCE!"
@Yourname942 2 months ago
I wonder if/when streaming services/YouTube will start using this (or if they won't, so they can continue to make you pay for higher-quality versions).
@omanussus 2 months ago
4:02 I think that nowadays more people have access to fast internet than to a powerful phone.
@alantaylor2694 2 months ago
I wonder if you can use it to get license plate numbers from bad dashcam footage in hit and run cases. Might have more than one frame to go off of as well.
@NorbertKasko 2 months ago
2:05 You can clearly see that it's not the same car with the same emblem. I think it's originally a Dacia, and it was made into a Nissan.
@FunnyVidsIllustrated 2 months ago
ENHANCE!
@gfdggdfgdgf 2 months ago
This should not be used to enhance faces on security footage, as the output will be a good-looking face, but at best one that merely resembles the original person when downscaled.
@wilburdemitel8468 2 months ago
it'd be great if the feds start believing this shit tho
@julinaut 2 months ago
just imagine when we get this for videos with good temporal coherence... oh boy
@MikkoRantalainen 2 months ago
I agree. If this actually worked for videos with good temporal coherence, it would sell new GPUs like hot cakes.
@Siderite 2 months ago
VR -> AR -> BR (Bent Reality). Soon to come in the next paper: NR! (No Reality)
@RefractArt 2 months ago
Never in my life would I have imagined seeing a Dacia Duster on Dr. Zsolnai-Fehér's channel 🤣
@BlueHound 2 months ago
This one is going to need a few more papers.
@hjups 2 months ago
You seemed to gloss over the fact that this method uses SDXL. It will neither run in real-time nor run on mobile devices as suggested. You also have to contend with how SDXL scales poorly to larger resolutions, restricting this method to smaller output resolutions than a GAN based SSR. Perhaps a smaller model with a similar method could work, but there's no guarantee that it would achieve these types of results. Additionally, using this method to decompress images on the web would probably be a bad idea since it will "hallucinate" detail that may change the meaning of the image, which would be amplified by the random nature of LDMs.
@perfectionbox 2 months ago
Yep, Blind Lake is almost here
@kipchickensout 2 months ago
Do you know if SUPIR will also be runnable offline? I'm kinda fond of stuff that can run offline, because I don't want all pictures to be processed on the internet - for example if it's pictures of myself or whatever. Sadly, a good bunch of the best stuff is often online or behind a paywall.
@phepheboi 1 month ago
Yes you can, and I think it has been for several weeks now. But it's heavy. Even on my 4090 it takes some time and uses a lot of resources. I think a 2K texture to 4K takes up to 5-10 min.
@kipchickensout 1 month ago
@@phepheboi Oh damn alright, yeah I'm running "only" a 3070
@vectoralphaAI 2 months ago
Imagine ALL the low resolution 240p, 480p and lower movies, tv shows and videos from the past fully restored to basically 4K with this AI. That would be incredible. I would love to watch SD shows, and cartoons and anime from back in the day with this clarity or old 1940s and 50s black and white movies in 4K and colorized. The near future will be amazing.
@jamiethomas4079 2 months ago
Shouldn't this be even easier for some cartoons, since they are sort of like cel shading, or can more easily be translated to vector-based images? Like Wind Waker can be run at whatever resolution you can compute.
@SireBab 2 months ago
My concern is that it's imagining a lot of details. So you may get a document with squiggles that suddenly has some nonsense written on it in a show.
@chiragvhora9995 2 months ago
2:24 The logo changed from a V shape to a round shape - the AI knows what's going to be there, but isn't accurate about what's actually there.
@OrionBlitz256 2 months ago
Can't wait to play 90s games with new graphics without having to do a remake.
@Siderite 2 months ago
Funny though, that it figured out the license plate but completely changed the brand of the car. That is a Dacia, and it replaced it with something Opel/Nissan.
@nangld 2 months ago
So when can we watch the Charlie Chaplin movies in 4K?
@DrD0000M 2 months ago
Now. Look up Chaplin 4K; there are color 60fps 4K videos. It loses the charm a bit though, I think.
@chenjus 2 months ago
What paper is that at 2:48?
@MOSMASTERING 2 months ago
Question - if this could run in real time (i.e. 30-60 fps) and you had an old video game (DOOM for example) being upscaled, are the upscalings of the textures done in a semi-random way that would give an individual frame a vastly improved pixel resolution that looks fine in isolation, but when checking frame by frame, there would be too much difference between them for a smooth video? Does that make sense? The process is guesswork rather than a precise upscale method. Photos or frames of video look great - but the process is done in such a way that an animation would look corrupt, or parts of the image would appear to wobble or jump around, because the process is randomly generative, not interpolating based on a fixed idea. I guess this is so hard to describe because many of the words needed don't exist yet - these new AI tools are creating things that have never been done before! Also, you absolutely canNOT use upscaled images in scientific applications or in a crime situation, like upscaling CCTV, because it's generating NEW data that definitely didn't exist before. You can't use it to look for stars or galaxies in upscaled space photos, or find a person based on an image upscale, because it's pure guesswork.
@jimmysyar889 2 months ago
I wonder if Sora combined with this could somehow do that.
@The_Questionaut 2 months ago
@@jimmysyar889 I bet we'll see something soon, if not in a few years.
@gatsby66 2 months ago
I was thinking of Doom, too. So fun, but so pixelated.
@tmgrassi 2 months ago
I think the term you were looking for was "temporal coherence". Is that right? Cheers!
@simonstoiljkovikj 1 month ago
2:20 from Dacia to Nissan :D
@AK-vx4dy 2 months ago
It's becoming scary, really fast.
@calvingrondahl1011 2 months ago
If it looks good it is good.
@MonkeySimius 2 months ago
It is neat that it creates a realistic output. One thing to keep in mind is that the output doesn't really contradict the input, but a lot of the details were clearly not in the input. Really, to demonstrate this we'd need to see the original, then the pixelated input the AI is given, then the output the AI created. I'm sure there would be tons of errors comparing the output directly with the non-pixelated original. TL;DR: this isn't the "enhance, enhance, enhance" trope promised to us by TV shows.
@jere473 2 months ago
I'm fairly certain true enhancement is simply not possible, because the information is not there, so at best it has to guess what should be there.
@UninstallingWindows 2 months ago
I remember this in the CSI movies ("Enhance... enhance more... enhance") and saying: that's not how image enhancing works. Looks like I might live long enough to see it become reality :D
@skmgeek 2 months ago
it's not really 'enhancing' the image though, more 'guessing the most likely pixels'
@ulamss5 2 months ago
Feels very much like current img2img with controlnet + segment anything regional/regional prompting
@timmygilbert4102 2 months ago
Previous techniques weren't able to recognize it was a person... Neither did I 😂
@gatsby66 2 months ago
I see at least one practical application: Changing the looks and ages of the FBI's Most Wanted and missing people. The missing people section often shows corpses, which creeps me out. I'd rather AI imagine what they looked like alive.
@philtrem 2 months ago
It's still going to lack temporal coherence if used as is on video I suppose. But the results are amazing.
@yessopie 2 months ago
It's scary to think of all of those detective shows where they "zoom in and enhance" to catch the criminal. Now the technology actually exists, but it simply hallucinates the details it's putting in. But are police detectives really going to understand that? These are the same people who think they can tell who is lying from their facial expressions, and who think that lie detectors actually work.
@TheRedRanger123 2 months ago
The upscaled images were part of the original training data, right? Because otherwise I don't know how the AI could imagine the skiing woman's face, including correct reflections on the visor. So the AI "remembers" something. If these were images the AI had never been trained on, I don't think the results would be that good.
@ioulios12 2 months ago
I hope this model gets integrated into waifu2x-caffe.
@YOEL_44 2 months ago
It went from a Dacia Duster to a Nissan Dusty
@Dimencia 2 months ago
What's the difference between this and something like Dall-E or Stable Diffusion? I mean, low resolution seems effectively the same thing as noise, and if it accepts text prompts, how is it different?
@60FpsGoodness 2 months ago
It's not. This method uses Stable Diffusion. You can do the same thing locally on your computer.
@Diastolicflame 2 months ago
Nvidia already has this - they've had super resolution for videos in real time for around a year now. It's more subtle than this, though.
@SireBab 2 months ago
This feels kind of odd, hasn't this tech been a thing with dlss, or Intel and amd's proprietary offering? What distinguishes dlss 2 (not frame generation) from this?
@lawrencefrost9063 2 months ago
This kind of tech should be default in phones in 3 years. Like you don't need an app, just every time you take a picture it makes it super clean if you opt in.
@evilutionltd 2 months ago
Definitely gen fill. It's using something similar to reverse Google image search, where it compares the image, finds similar images of a higher resolution, and uses that data to generate the photo. Amazing though. If they released it as paid software on the Mac, I'd probably buy it.
@mikegriffin1639 2 months ago
I think the first few seconds may be missing.
@user-if1ly5sn5f 2 months ago
Old kung fu movies and other ones will be way better now. I kinda wanna clean one up and give it to my dad.
@CamAlert2 2 months ago
This upscaling tech would not work in situations where fetching precise information is crucial - look at the example of the license plate on the vehicle. Otherwise, it would be a fantastic alternative to something like the JPEG format.
@am_ma 1 month ago
How fast are they?
@m2mdohkun 2 months ago
Everything has its pros and cons. Now, how do we create/innovate ways to detect misuse of this technology? And of course the next paper will bypass such detection. And we get back to making a detector for the detection. And another paper will... Ah! We've come full circle.
@NetTubeUser 2 months ago
The problem is that nobody can install and test it, because the installation instructions are incorrect and Python 3.8 is the wrong version for the environment - there are major conflicts, especially with the triton version. Plus, there's no explanation of the models and files we need to download. And there's still no change or answer to this question, for a week now. It's really strange that some people can't provide the right information. I really don't understand that.
@hulkemedia3543 2 months ago
Uuuuuh that eye
@TheRealNickG 2 months ago
2:33 Unreal Engine Nanites have left the chat...