Low Light to High Light Image (Night Eye)

Ismail Siddiqui
Sep 26, 2021
Low Light to High Light

Contents :

  1. Overview.
  2. The Data and Preprocessing.
  3. Loss Function and Metrics.
  4. Loading Data.
  5. Utility Callback.
  6. Model Architecture.
  7. Results.
  8. Sample Test Output.
  9. Code Link.
  10. Future Work.
  11. References.

1 — Overview: Low-light image enhancement is one of the most challenging tasks in computer vision, and it is actively researched and used to solve a variety of problems. Image processing usually achieves good performance under normal lighting conditions; under low light, however, an image turns out noisy and dark, which makes subsequent computer vision tasks difficult. To make buried details visible and to reduce blur and noise in a low-light capture, a low-light image enhancement step is necessary. Many different techniques have been explored for this problem, but most of them require considerable effort or expensive equipment to perform low-light image enhancement.

2 — The Data and Preprocessing: We use the Sony RAW image dataset, which consists of 2928 RAW images forming pairs of low-exposure (short) and high-exposure (long) shots. Each image has a resolution of 4256x2848 and the .ARW extension.

These RAW images are very different from the .JPG (or other common extension) images we usually work with.

A RAW image is the camera sensor's data and carries a lot of information such as exposure value, ISO, shutter speed, focal length, etc. Explaining each of these is beyond the scope of this blog; for now, think of them as describing how bright the scene was and which colour values and textures the sensor captured.

To process this raw sensor data, built-in smartphone/camera algorithms perform black-level subtraction, colour-value normalization, etc., and give us the .JPG or other common image type.

To load our .ARW files we will use the rawpy library, since skimage and other standard image-loading libraries do not support RAW images.

RAW to RGB

The above code snippet takes a RAW image, processes it and converts it to an RGB image, just like our smartphone/camera does.
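For reference, a minimal sketch of this conversion with rawpy might look like the following; the file path is a placeholder and the postprocess settings are only one reasonable choice, not necessarily the exact ones used here:

```python
import rawpy

# Load an .ARW file and let rawpy run a camera-like pipeline
# (demosaicing, white balance, gamma) to produce an 8-bit RGB image.
raw = rawpy.imread('Sony/short/00001_00_0.1s.ARW')   # placeholder path
rgb = raw.postprocess(use_camera_wb=True, no_auto_bright=True, output_bps=8)
print(rgb.shape, rgb.dtype)   # (2848, 4256, 3) uint8
```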

For this experiment we will not work with the RGB images; instead we will work directly with the RAW images.

To do this we load the RAW image, which is a single-channel image also known as a Bayer RAW. The Bayer pattern is half green, one quarter red and one quarter blue; here the 2x2 arrangement is BGGR, which means four colour channels are effectively packed into one single-channel image.

To unpack this Bayer RAW into a four-channel image we use the Bayer pattern to extract each channel; the colour-filter arrangement itself is known as a Bayer filter.

Bayer Filter
Unpack Raw

The above code snippet takes a RAW image of size (2848, 4256) and unpacks the Bayer RAW into a four-channel array of size (1424, 2128, 4); notice that each spatial dimension is halved while the number of channels increases to 4.
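A minimal sketch of such an unpacking function is shown below; the black level of 512 and the 14-bit maximum of 16383 are assumptions for this Sony sensor, and the path is a placeholder:

```python
import numpy as np
import rawpy

def pack_raw(raw_path):
    """Unpack a single-channel Bayer RAW image into a 4-channel array."""
    raw = rawpy.imread(raw_path)
    im = raw.raw_image_visible.astype(np.float32)
    # subtract the black level and normalize to [0, 1] (values assumed for this sensor)
    im = np.maximum(im - 512, 0) / (16383 - 512)
    im = np.expand_dims(im, axis=2)                       # (H, W, 1)
    h, w = im.shape[0], im.shape[1]
    # take one value from each 2x2 Bayer block per output channel
    return np.concatenate((im[0:h:2, 0:w:2, :],
                           im[0:h:2, 1:w:2, :],
                           im[1:h:2, 1:w:2, :],
                           im[1:h:2, 0:w:2, :]), axis=2)  # (H/2, W/2, 4)

packed = pack_raw('Sony/short/00001_00_0.1s.ARW')         # placeholder path
print(packed.shape)                                        # (1424, 2128, 4)
```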

Now we take the short-exposure images and unpack them, and we take the long-exposure images; these will be our input and output respectively.

But there are some points that we have to keep in mind while loading and processing these images:

Step-1: We don’t resize these images instead we take random patches of the images of any size that fits in the GPU memory, as for this experiment we have taken 256x256 size of patches. WHY? Because the image size of input is (1424,2128,4 ) which will never fit in the GPU memory, so we make patches of the images instead of resizing it. As the input is RAW sensor data resizing the image may lead to loss of some vital information, so we don’t resize it.

Patch Extractor
Sample Image Patch
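A minimal sketch of a patch extractor under these assumptions: the input is already packed to (1424, 2128, 4), the ground truth is kept as full-resolution RGB, and the ground-truth patch is taken at twice the size because the network upsamples 2x at the end:

```python
import numpy as np

def random_patch(in_img, gt_img, patch_size=256):
    """Crop a random, aligned patch pair from a packed input and its RGB ground truth."""
    h, w = in_img.shape[:2]
    y = np.random.randint(0, h - patch_size)
    x = np.random.randint(0, w - patch_size)
    in_patch = in_img[y:y + patch_size, x:x + patch_size, :]       # (256, 256, 4)
    gt_patch = gt_img[2 * y:2 * (y + patch_size),
                      2 * x:2 * (x + patch_size), :]               # (512, 512, 3)
    return in_patch, gt_patch
```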

Step-2: Loading and processing these images takes a long time, so it is not wise to load and process them at every training step. Instead we use a simple hack: we load and process each image once, obtain an n-dimensional array, and then dump/save that array to SSD/HDD (a sketch follows the caption below).

Single Image Dumper
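A minimal sketch of dumping each processed array to disk with NumPy; paths and naming are illustrative:

```python
import numpy as np

def dump_image(array, out_path):
    """Save a processed image array once so it never has to be re-processed."""
    np.save(out_path, array)            # written as out_path + '.npy'

def load_image(out_path):
    return np.load(out_path + '.npy')
```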

Step-3: To further reduce loading time, we dump/save the patched images (arrays) in batches that can be fed directly to training; for this experiment we used a batch size of 4 (a sketch follows the caption below). Of course, steps 2 and 3 may take a few minutes, but they greatly reduce data-loading time during training, which makes it much easier to experiment with and hypertune the network.

Batch Dump
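A minimal sketch of grouping pre-extracted patches into batches of 4 and saving each batch as a single file; the directory layout and file names are illustrative:

```python
import numpy as np

def dump_batches(input_patches, gt_patches, out_dir, batch_size=4):
    """Stack patch pairs into batches and save each batch as one .npy pair."""
    for i in range(0, len(input_patches) - batch_size + 1, batch_size):
        x = np.stack(input_patches[i:i + batch_size])    # (4, 256, 256, 4)
        y = np.stack(gt_patches[i:i + batch_size])       # (4, 512, 512, 3)
        np.save(f'{out_dir}/batch_{i // batch_size}_x.npy', x)
        np.save(f'{out_dir}/batch_{i // batch_size}_y.npy', y)
```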

Link to Dataset

3 — Loss Function and Metrics: To get the best performance we have a variety of losses to work with; for now we will use only one loss function and one metric (a short sketch of both follows the list):

  • Mean Absolute Error (MAE): The mean absolute error of a model with respect to a test set is the mean of the absolute values of the individual prediction errors over all instances in the test set. Mathematically:
Mean Absolute Error
  • PSNR: Peak signal-to-noise ratio is an engineering term for the ratio between the maximum possible power of a signal and the power of the corrupting noise that affects the fidelity of its representation. Because many signals have a very wide dynamic range, PSNR is expressed on a logarithmic (decibel) scale, and it is commonly used to evaluate noise in images. A higher PSNR indicates a better reconstruction of the image.
PSNR
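A minimal sketch of wiring up both in TensorFlow/Keras, assuming images normalized to [0, 1]:

```python
import tensorflow as tf

# MAE as the training loss; PSNR as a monitoring metric.
mae_loss = tf.keras.losses.MeanAbsoluteError()

def psnr(y_true, y_pred):
    return tf.image.psnr(y_true, y_pred, max_val=1.0)

# model.compile(optimizer='adam', loss=mae_loss, metrics=[psnr])
```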

4 — Loading Data: To work with the images we first need to load them, and we do so using a custom image data generator:

  • Data Generator: To make our work easy we build on TensorFlow/Keras utilities; we just pass a dataframe containing the image paths. The generator takes a random sample, loads the arrays, multiplies the input image by a randomly chosen amplification factor (100, 250 or 300) and augments it (a minimal sketch follows the caption below).
Data Generator
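A minimal sketch of such a generator, assuming a dataframe with hypothetical 'x_path' and 'y_path' columns pointing to the pre-dumped .npy batches; the augmentation shown (a horizontal flip) is only illustrative:

```python
import numpy as np
import tensorflow as tf

class RawDataGenerator(tf.keras.utils.Sequence):
    def __init__(self, df, amplification=(100, 250, 300)):
        self.df = df.reset_index(drop=True)
        self.amplification = amplification

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.sample(1).iloc[0]                    # random sample
        x = np.load(row['x_path'])                         # (4, 256, 256, 4)
        y = np.load(row['y_path'])                         # (4, 512, 512, 3)
        x = x * np.random.choice(self.amplification)       # random amplification
        if np.random.rand() < 0.5:                         # simple flip augmentation
            x, y = x[:, :, ::-1, :], y[:, :, ::-1, :]
        return np.clip(x, 0.0, 1.0), y
```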

5 — Utility Callback: To visualize the training process we display the input, ground-truth and predicted images using a custom callback (sketched below the caption).

Predict Callback
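A minimal sketch of such a callback using matplotlib; the held-out sample pair is assumed to be passed in at construction:

```python
import matplotlib.pyplot as plt
import tensorflow as tf

class PredictCallback(tf.keras.callbacks.Callback):
    def __init__(self, sample_x, sample_y):
        super().__init__()
        self.sample_x = sample_x        # (1, 256, 256, 4) packed input
        self.sample_y = sample_y        # (1, 512, 512, 3) ground truth

    def on_epoch_end(self, epoch, logs=None):
        pred = self.model.predict(self.sample_x)
        fig, axes = plt.subplots(1, 3, figsize=(12, 4))
        axes[0].imshow(self.sample_x[0, :, :, :3])   # show 3 of the 4 packed channels
        axes[0].set_title('Input')
        axes[1].imshow(self.sample_y[0])
        axes[1].set_title('Ground truth')
        axes[2].imshow(pred[0].clip(0, 1))
        axes[2].set_title('Prediction')
        plt.show()
```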

6 — Model Architecture: We have defined a custom U-Net model architecture with several improvements and tuning.

Model Architecture

In the proposed model we use a kernel size of 3 and a stride of 1. Additionally, we concatenate encoder and decoder feature maps (skip connections) to mitigate vanishing gradients and use dropout to avoid overfitting. At the end of the network the feature maps have shape (None, 256, 256, 12); to reach the target size of (512, 512, 3) we use a depth-to-space, or sub-pixel, layer. A sketch of such an architecture is shown below.
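A minimal sketch of a U-Net-style network in this spirit; the filter counts, depth and dropout rate here are illustrative, not the exact configuration used in the experiment:

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # two 3x3 convolutions with stride 1, as described above
    x = layers.Conv2D(filters, 3, strides=1, padding='same', activation='relu')(x)
    x = layers.Conv2D(filters, 3, strides=1, padding='same', activation='relu')(x)
    return x

def build_unet(input_shape=(256, 256, 4)):
    inp = layers.Input(input_shape)

    c1 = conv_block(inp, 32)
    p1 = layers.MaxPooling2D()(c1)
    c2 = conv_block(p1, 64)
    p2 = layers.MaxPooling2D()(c2)

    b = conv_block(p2, 128)
    b = layers.Dropout(0.3)(b)                       # dropout against overfitting

    u2 = layers.Conv2DTranspose(64, 2, strides=2, padding='same')(b)
    u2 = layers.Concatenate()([u2, c2])              # skip connection
    c3 = conv_block(u2, 64)
    u1 = layers.Conv2DTranspose(32, 2, strides=2, padding='same')(c3)
    u1 = layers.Concatenate()([u1, c1])              # skip connection
    c4 = conv_block(u1, 32)

    out12 = layers.Conv2D(12, 3, padding='same')(c4)                   # (256, 256, 12)
    out = layers.Lambda(lambda t: tf.nn.depth_to_space(t, 2))(out12)   # (512, 512, 3)
    return tf.keras.Model(inp, out)
```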

  • Sub Pixel: It is a specific implementation of a deconvolution layer that can be interpreted as a standard convolution in low-resolution space followed by a periodic shuffling operation. Sub-pixel convolution has the advantage over standard resize convolutions that, at the same computational complexity, it has more parameters and thus better modelling power.
Sub Pixel
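A quick sanity check of the depth-to-space shape transformation described above:

```python
import tensorflow as tf

x = tf.zeros((1, 256, 256, 12))
y = tf.nn.depth_to_space(x, block_size=2)   # rearranges channel blocks into space
print(y.shape)                               # (1, 512, 512, 3)
```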

7 — Results: After training the model for 200 epochs we obtained pretty good results —

Loss Vs Epochs
Epoch Vs PSNR

8 — Sample Test Output: Here are some sample outputs on test data —

Sample Output

9 — Code Link: GitHub

10 — Future Work:

  • To further increase the performance of the models we can experiment with different model types.
  • Train for more epochs, e.g. 1000, to improve performance.
  • To make the model more robust we can combine RAW image data from multiple sensors.

11 — References:

