Hacker News

I think they're probably right about the AI-sharpening using specific knowledge about the moon... However, they are wrong about the detail being gone in the gaussian-blurred image.

If they applied a perfect digital gaussian-blur, then that is reversible (except at the edges of the image, which are black in this case anyway). You still lose some detail due to rounding errors, but not nearly as much as you might expect.

A gaussian blur (and several other kinds of blur) are a convolution of the image with a specific blur function. A convolution is equivalent to simply multiplying pointwise the two functions in frequency space. As long as you know the blur function exactly, you can divide the final image by the gaussian function in frequency space and get the original image back (modulo rounding errors).
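
A rough 1-D sketch of this (NumPy, with a random signal standing in for an image row; the sigma and sizes are arbitrary assumptions): blur by pointwise multiplication in frequency space, then recover by pointwise division with the same transfer function.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.random(64)                       # stand-in for one image row

x = np.arange(64)
kernel = np.exp(-0.5 * ((x - 32) / 1.0) ** 2)
kernel /= kernel.sum()                        # normalized Gaussian, sigma = 1

# Blur = pointwise multiplication in frequency space (circular convolution)
K = np.fft.fft(np.fft.ifftshift(kernel))
blurred = np.fft.ifft(np.fft.fft(signal) * K).real

# Deblur = pointwise division by the same transfer function
recovered = np.fft.ifft(np.fft.fft(blurred) / K).real
print(np.max(np.abs(recovered - signal)))     # tiny: only float rounding error remains
```

This only works because the blurred values are kept at full float precision; quantizing to 8 bits or photographing a screen adds noise exactly where the transfer function is small, as other comments point out.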

It is not totally inconceivable that the AI model could have learned to do this deconvolution with the Gaussian blur function, in order to recover more detail from the image.



The author tested for this by doing the experiment again with the detail clipped into the highlights, completely gone, and the model still added the detail back.

> To further drive home my point, I blurred the moon even further and clipped the highlights, which means the area which is above 216 in brightness gets clipped to pure white - there's no detail there, just a white blob - https://imgur.com/9XMgt06

> I zoomed in on the monitor showing that image and, guess what, again you see slapped on detail, even in the parts I explicitly clipped (made completely 100% white): https://imgur.com/9kichAp


While I think this is a great test, I'm not really sure what that second picture is supposed to be showing. Kinda seems like they used the wrong picture entirely.


The second image is a video; it shows them zooming in and how the blob switches to detail.


Ah! Thank you! I wasn't getting the controls for some reason.

Given how small the pure-white areas are, tbh I'm not sure I'd consider that as having "added detail". It has texture that matches the rest of the moon, but that's about as far as I'd be comfortable claiming... and that seems fine, basically an un-blurring artifact like you see in tons of sharpening algorithms.

I do think this "clip the data, look for impossible details" is a very good experiment and one that seems likely to bear fruit, since it's something cameras "expect" to encounter. I just don't think this instance is all that convincing.

---

And to be clear, I absolutely believe Samsung is faking it, and hiding behind marketing jargon. The outputs are pretty ridiculous. They may not be "photoshopping on a texture", but training an AI on moon pictures and asking it to add those details to images is... well, the same thing Photoshop has features for. It makes no difference - it's not maximizing the data available, it's injecting external data, and they deserve to be slammed for that.


I watched the video and in this case the "recovered" detail is clearly natural to me. The original case does look like some kind of moon-specific processing, but this one with clipped highlights seems natural and can be achieved using classical CV.


What? Clipped means gone - the pixel is FFFFFF - how can CV look at a FFFFFF pixel, surrounded by FFFFFF pixels, and get out a moon-looking pixel?


Because the nearby pixels are not clipped.


> As long as you know the blur function exactly, you can divide the final image by the gaussian function in frequency space and get the original image back (modulo rounding errors).

Those rounding errors are very important though. The Gaussian function goes to zero very quickly and dividing by small numbers is not a good idea.

If you're deconvolving a noise-free version of the original that also doesn't have any saturated pixels (in the black or white direction), then you can get pretty close to the original back. I don't think this applies here, because the OP is taking a picture of a screen that shows the blurred version, so we've got all kinds of error sources. I think the OP is right: the camera is subbing in a known picture of the moon.
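
A quick way to see how fragile the division is (hypothetical 1-D sketch with NumPy): store the blurred signal at 8-bit precision, as a screen would, and naive frequency-space division blows up, because the quantization error lands on frequencies where the Gaussian transfer function is nearly zero.

```python
import numpy as np

rng = np.random.default_rng(1)
signal = rng.random(64)
x = np.arange(64)
kernel = np.exp(-0.5 * ((x - 32) / 4.0) ** 2)    # wider blur, sigma = 4
kernel /= kernel.sum()
K = np.fft.fft(np.fft.ifftshift(kernel))

blurred = np.fft.ifft(np.fft.fft(signal) * K).real
quantized = np.round(blurred * 255) / 255        # 8-bit storage, like a screen

naive = np.fft.ifft(np.fft.fft(quantized) / K).real
print(np.max(np.abs(naive - signal)))            # enormous: dividing by ~0 amplifies the error
```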

It would be interesting to see what happens with anisotropic blur for example, or with a picture of the moon with added fake details (words maybe?) and then blurred.


> However, they are wrong about the detail being gone in the gaussian-blurred image.

Well yes, but he also downsampled the image to 170x170. As far as I know, downsampled information is strictly lost, and unrecoverable without an external information source (like an AI model trained on pictures of the moon).


I'm too lazy to downscale it myself, so here's a 180x180 picture of the moon from WP [1]. This looks about the same as the Samsung result [2]. They are not getting the original detail, but they are getting the detail they should expect if Samsung simply deconvolved the blurred image.

[1] https://upload.wikimedia.org/wikipedia/commons/thumb/2/2b/Lu...

[2] https://imgur.com/bXJOZgI


>If they applied a perfect digital gaussian-blur, then that is reversible

Not true. Deconvolution is a statistical estimate. Think about it: when you blur, colors get combined with their neighbors. Statistically this moves toward a middle grey. You're compressing the gamut of colors towards the middle, and thus losing information. Look at an extreme case: 2 pixels of mid-grey. It can be deconvolved to itself, to a light and a dark grey, or to one black and one white pixel. All those deconvolutions are equally valid. There's no 1-to-1 inverse to a convolution. If you do a Gaussian blur on a real photo and then run a deconvolution algorithm, you'll get a different image with an arbitrary tuning, probably biased towards max contrast in details and light noise, since that's what people expect from such tools and what most real photos have. But, just like AI-enhanced images, it's using statistics to fill in the missing data.



Wow, that is so cool, and such a good writeup. I like the analogy to an encrypted file, and the key being the exact convolution. The amount of information lost is the amount of information in the key.

I wonder if there is some algorithmic way to find the key and tell if it's correct - some dictionary attack, or some loss function that knows if it's close. Perhaps such a thing only works on images that are similar to a training set. It wouldn't work on black and white random dots, since there'd be no way for a computer to grade or know statistics for which deconvolution looks right.


But the AI should not have learned to apply a Gaussian deconvolution kernel. If anything it should be applying a lens-based bokeh kernel instead. A true lens blur does not behave like a Gaussian blur.


They don't get an exact reconstruction of the original image. What happens if you apply Gaussian blur and then try to undo it with a bokeh kernel?


A mess.


A mess like in the OP https://imgur.com/ULVX933 or something worse?

(Just in case, original image https://imgur.com/PIAjVKp )


You certainly wouldn't be able to recover any detail this way (in fact, deconvoluting Gauss after taking a photo of the picture displayed on a computer screen won't give you any kind of sensible results either - try it yourself).


While the information might be recoverable, the information is not seen by the camera sensor. Hence I think the argument in the post stands. Some AI model/overlay magic is happening, pretending to display information the sensor simply did not receive.


You're forgetting it was also downscaled to 170x170, and later had the highlights clipped. Both are irreversible.
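
Both are easy to demonstrate with hypothetical values (NumPy): clipping maps many distinct inputs to one output, and 2x2 averaging does the same, so neither has a unique inverse.

```python
import numpy as np

# Clipping: everything above the threshold becomes pure white
clip = lambda v: np.where(v > 216, 255, v)
a = np.array([220, 240, 255])
b = np.array([217, 230, 254])
assert np.array_equal(clip(a), clip(b))      # distinct inputs, identical output

# Downscaling: different 2x2 blocks can share the same average
blk1 = np.array([[100, 200], [200, 100]])
blk2 = np.array([[150, 150], [150, 150]])
assert blk1.mean() == blk2.mean()            # both become the same single pixel
```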


This is incorrect. The frequency domain inverse of the Gaussian ends up yielding a division by zero. There is no inverse for the Gaussian.


That is mathematically true but not practically. Though indeed the Gaussian kernel has lots of zeros [1], in actuality, (a) the zeros themselves are at points, not regions, and therefore of little consequence, and (b) in practice the noise generated from reamplifying frequencies near these zeros can be minimized via techniques such as Wiener deconvolution [2].

[1] https://en.wikipedia.org/wiki/Window_function#Gaussian_windo...

[2] https://en.wikipedia.org/wiki/Wiener_deconvolution
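
A minimal 1-D sketch of the Wiener idea (NumPy, with an assumed constant signal-to-noise ratio): instead of dividing by K outright, the filter conj(K) / (|K|^2 + 1/SNR) caps the gain near the zeros, so the noise stays bounded while the well-conditioned frequencies are still restored.

```python
import numpy as np

rng = np.random.default_rng(2)
signal = rng.random(64)
x = np.arange(64)
kernel = np.exp(-0.5 * ((x - 32) / 4.0) ** 2)
kernel /= kernel.sum()
K = np.fft.fft(np.fft.ifftshift(kernel))

# Blur, then add a little measurement noise
noisy = np.fft.ifft(np.fft.fft(signal) * K).real + rng.normal(0, 1e-3, 64)

naive = np.fft.ifft(np.fft.fft(noisy) / K).real    # explodes near the spectral zeros
G = np.conj(K) / (np.abs(K) ** 2 + 1e-4)           # Wiener filter, assumed SNR of 1e4
wiener = np.fft.ifft(np.fft.fft(noisy) * G).real   # stays bounded, loses only high frequencies

print(np.max(np.abs(naive - signal)), np.max(np.abs(wiener - signal)))
```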


They didn't claim invertible. The de-gaussinization is a reversible process albeit not invertible. I actually say more in this comment

https://news.ycombinator.com/item?id=35111998


No amount of math is going to save the original detail from getting downsampled to 170x170


A major problem with blur, beyond rounding errors, is that photon shot noise has a standard deviation equal to the square root of the photon count. (The blur itself might come from the optics being somewhat soft due to manufacturing difficulties and tradeoffs made to loosen assembly tolerance requirements, like wanting rotationally symmetrical optical surfaces despite a rectangular actively-used image focal plane (e.g. a CMOS photodiode array), rather than a design specialized to evenly light up just that rectangle.)

A smartphone sensor pixel has space for some low-four-digit number of electrons (created with some probability from photons, but that stochastic effect doesn't matter for anything a normal user would photograph) and typically has a fixed 2-10 electron standard deviation from the analog-to-digital converter (well, mostly from the amplifiers involved in that process).

So if your pixel is fully exposed at, say, 10,000 electrons, and you take √ of that, you have 100 electrons stddev from shot noise, plus a worst case of 10 electrons stddev from the readout amplifier/ADC. If you have a dark pixel that got 100x less light and only accumulated 100 electrons, √ of that gives 10 electrons stddev of shot noise, plus the same 10 electrons stddev from the readout amplifier/ADC.

The problem is that while you have an SNR of 5 with the dark pixel, when trying to deconvolve it out of a nearby bright pixel, even perfectly with no rounding errors (1 electron = 1 ulp/lsb in a linear raw format), you now have 100/110 = 10/11 ≈ 0.91. That's far worse than the 5 from before. This gets worse if your ADC has only 2 electrons stddev instead of 10 (about 2x worse here).
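
The arithmetic above as a sketch (hypothetical numbers from this comment; note the stddevs are added linearly here to match the 100/110 figure, where adding in quadrature would be slightly more favorable):

```python
import math

full_well = 10_000     # electrons in a fully exposed bright pixel
dark = 100             # electrons in a nearby pixel that got 100x less light
read_noise = 10.0      # readout amplifier / ADC stddev, in electrons

# SNR of the dark pixel read directly: 100 / (10 + 10) = 5.0
snr_dark = dark / (math.sqrt(dark) + read_noise)

# SNR when deconvolving it out of the bright pixel: 100 / (100 + 10) ≈ 0.91
snr_after = dark / (math.sqrt(full_well) + read_noise)

print(snr_dark, snr_after)
```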

That's the reason why deconvolution after the photon detector is a band aid that you only begrudgingly tend to accept.

The trade-off just requires massively increased aperture/light gathering, likely negating your savings on optics.


> If they applied a perfect digital gaussian-blur, then that is reversible

Actually, any noise distribution is frequently reversible if you know the parameters and number of steps. This is in fact how diffusion models work (there's even work on Normalizing Flows removing realistic camera noise). It is just almost impossible to figure this out, since there are many equivalent-looking ways. But we need to be clear that there is a difference between reversibility and invertibility. An invertible process is bijective: it perfectly recreates the original setting. A reversible process can just work in both directions and isn't guaranteed to be invertible. (Invertible means reversible, but reversible doesn't mean invertible.)

I bring this up because even more complicated versions of blurring could be argued as not "faked" but rather "enhanced." A better way to test whether Samsung is faking the data is to mask out regions. If the phone fills in the gaps, then it is definitely generating new data. This can still be fine if the regions are small, unless we also want to call bilinear interpolation "faked," but I don't think most people would. This is why it gets quite difficult to actually prove Samsung is faking the image. I don't have a Samsung phone to test this, though.

So basically I'm with you, and I'd even back a slightly stronger version of this:

> It is not totally inconceivable that the AI model could have learned to do this deconvolution with the Gaussian blur function, in order to recover more detail from the image.

Edit: After reading other comments I wanted to bring some things up.

- The downscaling is reversible, but not invertible. We can upscale, reversing the process. But yes, there is information lost. But some data can still be approximated and/or inferred.

- The clipping experiment isn't that good. Honestly, looking at the two my brain fills in the pieces and they look reasonable to me too. Clipping the brightness isn't enough, especially since it is a small portion of the actual distribution. I did this on both the full image and small image and both are difficult to distinguish by eye from the non-clipped. Clipping below 200 seems to better wash out the bottom of the moon and remove that detail. 180 seems better though tbh.


The level of BS in this thread perfectly resembles the BS in religious-level audiophile discussions. A mixture of provably correct and provably incorrect statements all mixed together with common words used in uncommon ways.

> But yes, there is information lost. But some data can still be approximated and/or inferred.

The perfect summary.


Not all information is equally important, though. Most people can't tell FLAC from a high-quality lossy compression. If you cut off everything above 19 kHz and below 70 Hz, you've lost information, but not important information. The same analogy holds for imagery. Which information is lost matters, which is why I discuss clipping at different levels and masking to make a stronger case. I'm just saying I don't think we can conclude an answer from this experiment, not that their claim is wrong. I want to be clear about that.


So Sydney is an audiophile? Got it!


I would be interested to see what the best possible deconvolution of the blurred image looks like, if anyone has the setup and knowledge to try it?


Gaussian blur essentially acts as a low-pass filter. A low-pass filter does not strictly destroy the information in the filtered spectrum components, but it does attenuate their power.

Given a perfect blurred image, reconstruction is possible; however, due to the attenuation, those high-frequency components are very sensitive to noise.

Apart from the quantisation effects you mentioned, which limit perfect deconvolution, adding a little additive white Gaussian noise after the kernel is applied (such as by taking a photo of the image from across the room) obliterates the high-frequency features.

Recovery when noise is low (plus known glyphs) is why you should not use Gaussian blur followed by a screenshot to redact documents. The inability to recover when there are artifacts and noise is part of why cameras cannot just set a fixed focus at some distance and deconvolve with the estimated blur width at each pixel to deblur everything that was out of focus.

TL;DR for readers: it is unlikely that sufficient detail can be recovered via deconvolution here.


Is it Gaussian blur, though, or some other invertible kernel?


I wondered that too. I think he should have altered the moon image a bit before applying the filter.


The author did. They ran it again, intentionally CLIPPING the detail so there was none (not just blurred away), and it still put detail in.


I seem to have read the whole post but didn't notice anything about that, my bad.


This is wrong. The blurred image contains only intensity information, but reversing the convolution in frequency space would require phase information as well. A simple Gaussian blur is not reversible, even in principle.


There is no "phase information" in the spatial domain. "Phase" is literally, where the pixels are on the screen.

Rather, reversing blur of any type is limited (a) by spatial decimation (a.k.a. down sampling, which is performed in the article), and (b) by noise/quantization floor, below which high frequency content has been pushed.


The input of the DFT is real, but the output is complex. Filtering in the Fourier domain means that the DFT of the image is multiplied with that of the filter. The resulting complex array is then converted into an output image by taking the magnitude. This destroys the phase information and makes the operation irreversible.

Another point of view is that there are infinitely many images that will produce the same result after blurring. Obviously, this makes the operation irreversible.


> The resulting complex array is than converted into an output image by taking the magnitude.

No, it's emphatically not. Perhaps you are thinking of displaying a spectrogram.

To produce an image from frequency-domain data, inverse DFT must be applied. Since (as @nyanpasu64 points out), the DFT of a real-valued image or kernel is conjugate-symmetric (and vice-versa), the result is again real-valued without loss of information. The phase information is not lost. If it were, the image would be a jumbled mess.

(Not that DFT+inverse DFT is necessary for Gaussian blur anyway -- you simply convolve with a truncated Gaussian kernel.)

> Another point of view is that there are infinitely many images that will produce the same result after blurring.

No, this is not true. I don't know why you think it is. This is only true of a brick wall filter, which Gaussian filter is not [1].

The SNR of high-spatial-frequency components is reduced for sure, which can lead to irrevocable information loss. But this is nothing to do with phase.

[1] https://en.wikipedia.org/wiki/Window_function#Gaussian_windo...


Both an image and a convolution have a conjugate-symmetric DFT, and the phase of each complex bin encodes spatial information, and there's no "taking the magnitude" involved anywhere when turning a spectrum back into an image (only complex cancellation).
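
This is straightforward to check numerically (hypothetical 8x8 random arrays, NumPy): the DFT of real data is conjugate-symmetric, and multiplying two such spectra and inverse-transforming gives a real result without any magnitude step.

```python
import numpy as np

rng = np.random.default_rng(3)
img = rng.random((8, 8))                     # real-valued "image"
ker = rng.random((8, 8))                     # real-valued kernel
F, K = np.fft.fft2(img), np.fft.fft2(ker)

# Conjugate symmetry of a real input: F[-i, -j] == conj(F[i, j])
idx = (-np.arange(8)) % 8
assert np.allclose(F[np.ix_(idx, idx)], np.conj(F))

# Filter, then inverse DFT: the imaginary parts cancel on their own,
# so no magnitude is taken anywhere
out = np.fft.ifft2(F * K)
print(np.max(np.abs(out.imag)))              # float rounding residue, numerically zero
```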



