renom_img.api.utility

renom_img.api.utility.box

rescale ( box , before_size , after_size )

Rescale box coordinates and size to a specified size.

Parameters:
  • box ( list ) – This list has 4 elements that represent the box coordinates.
  • before_size ( float ) – Size of the box before rescaling.
  • after_size ( float ) – Size of the box after rescaling.
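Example

A hedged usage sketch; the exact scaling behavior (each value multiplied by after_size / before_size) is an assumption, and the box format follows the (x1, y1, x2, y2) convention used below.

>>> from renom_img.api.utility.box import rescale
>>> box = [10.0, 20.0, 50.0, 80.0]        # coordinates measured at before_size
>>> scaled = rescale(box, 100.0, 200.0)   # assumed: each value scaled by 200 / 100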
transform2xywh ( box )

This function changes box’s coordinate format from (x1, y1, x2, y2) to (x, y, w, h).

( x1 , y1 ) represents the coordinate of upper left corner. ( x2 , y2 ) represents the coordinate of lower right corner.

( x , y ) represents the center of the bounding box. ( w , h ) represents the width and height of the bounding box.

The argument box must follow the format of the example below.

[x1(float), y1(float), x2(float), y2(float)]
Parameters: box ( list ) – This list has 4 elements that represent above coordinates.
Returns: The reformatted bounding box.
Return type: (list)
transform2xy12 ( box )

This function changes box’s coordinate format from (x, y, w, h) to (x1, y1, x2, y2).

( x , y ) represents the center of the bounding box. ( w , h ) represents the width and height of the bounding box.

( x1 , y1 ) represents the coordinate of upper left corner. ( x2 , y2 ) represents the coordinate of lower right corner.

The argument box must follow the format of the example below.

[x(float), y(float), w(float), h(float)]
Parameters: box ( list ) – This list has 4 elements that represent above coordinates.
Returns: The reformatted bounding box.
Return type: (list)
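Example

transform2xywh and transform2xy12 are inverses of each other. A short round-trip sketch; the expected values in the comments follow directly from the definitions above.

>>> from renom_img.api.utility.box import transform2xywh, transform2xy12
>>> box = [10.0, 20.0, 50.0, 80.0]   # (x1, y1, x2, y2)
>>> xywh = transform2xywh(box)       # center (30.0, 50.0), width 40.0, height 60.0
>>> xy12 = transform2xy12(xywh)      # round-trips back to [10.0, 20.0, 50.0, 80.0]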
calc_iou_xyxy ( box1 , box2 )

This function calculates IoU for boxes in the coordinate format (x1, y1, x2, y2).

( x1 , y1 ) represents the coordinate of the upper left corner. ( x2 , y2 ) represents the coordinate of the lower right corner.

The argument boxes must follow the format of the example below.

[x1(float), y1(float), x2(float), y2(float)]
Parameters:
  • box1 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
  • box2 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
Returns:

The IoU value.

Return type:

(float)

calc_iou_xywh ( box1 , box2 )

This function calculates IoU for boxes in the coordinate format (x, y, w, h).

( x , y ) represents the coordinate of the center. ( w , h ) represents the width and height.

The argument boxes must follow the format of the example below.

[x(float), y(float), w(float), h(float)]
Parameters:
  • box1 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
  • box2 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
Returns:

The IoU value.

Return type:

(float)
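Example

A worked example with the same pair of boxes in both formats. Two 2x2 boxes offset by 1 in each direction intersect in a 1x1 region, so IoU = 1 / (4 + 4 - 1) ≈ 0.143.

>>> from renom_img.api.utility.box import calc_iou_xyxy, calc_iou_xywh
>>> calc_iou_xyxy([0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0])  # corner format
>>> calc_iou_xywh([1.0, 1.0, 2.0, 2.0], [2.0, 2.0, 2.0, 2.0])  # same boxes, center format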

renom_img.api.utility.load

parse_xml_detection ( xml_path_list , num_thread=8 )

Parses XML annotation files. The XML format must be Pascal VOC format.

Parameters:
  • xml_path_list ( list ) – List of paths to XML files.
  • num_thread ( int ) – Number of threads used for parsing the XML files.
Returns:

This returns a list of annotations. Each annotation is a list of dictionaries that include the keys ‘box’, ‘name’ and ‘class’. The structure is shown below.

Return type:

(list)

# An example of returned list.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        ...
    ]
]
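Example

A hedged usage sketch; the directory path is hypothetical, and the return value is the annotation list structured as shown above.

>>> import os
>>> from renom_img.api.utility.load import parse_xml_detection
>>> xml_dir = 'VOCdevkit/VOC2012/Annotations'  # hypothetical path
>>> xml_path_list = [os.path.join(xml_dir, f) for f in sorted(os.listdir(xml_dir))]
>>> annotation_list = parse_xml_detection(xml_path_list)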

renom_img.api.utility.nms

nms ( preds , threshold=0.5 )

Non-Maximum Suppression

Parameters:
  • preds ( list ) – A list of predicted boxes. The format is as follows.
  • threshold ( float , optional ) – Defaults to 0.5. Boxes that overlap an already-selected box by more than this IoU ratio are suppressed.
Example of the argument “preds”.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ]
]
Returns: The list of boxes remaining after suppression, in the same format as preds.
Return type: (list)
Example of return value.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ]
]
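Example

A hedged usage sketch following the argument format above. With two heavily overlapping boxes of the same class, only the higher-scoring box should survive at the default threshold.

>>> from renom_img.api.utility.nms import nms
>>> preds = [[
...     {'box': [0.50, 0.50, 0.20, 0.30], 'class': 0, 'score': 0.9},
...     {'box': [0.52, 0.50, 0.20, 0.30], 'class': 0, 'score': 0.4},
... ]]
>>> kept = nms(preds, threshold=0.5)  # expected: the 0.4-score box is suppressed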
soft_nms ( preds , threshold=0.5 )

Soft Non-Maximum Suppression

Parameters:
  • preds ( list ) – A list of predicted boxes. The format is as follows.
  • threshold ( float , optional ) – Defaults to 0.5. Boxes that overlap an already-selected box by more than this IoU ratio have their scores decayed.
Example of the argument “preds”.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]
Returns: The list of boxes with decayed scores, in the same format as preds.
Return type: (list)
Example of the output.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]
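Unlike standard NMS, soft-NMS does not discard overlapping boxes outright. In the linear variant described in the reference below, the score \(s_i\) of a box \(b_i\) that overlaps the currently selected box \(M\) beyond the threshold is decayed as \(s_i \leftarrow s_i (1 - IoU(M, b_i))\), so heavily overlapping boxes are demoted rather than removed.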

References

Navaneeth Bodla, Bharat Singh, Rama Chellappa, Larry S. Davis,
Soft-NMS – Improving Object Detection With One Line of Code

renom_img.api.utility.target

class DataBuilderClassification ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a classification task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )

Builds an array of images and corresponding labels

Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – List of class IDs, e.g. [1, 4, 6].
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

Batch of images and the corresponding one-hot labels for each image in the batch.

Return type:

(tuple)

class DataBuilderDetection ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a detection task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )
Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – List of annotations in the format produced by parse_xml_detection.
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

Batch of images and an ndarray whose shape is (# images, maximum number of objects in an image * (4 coordinates + 1 confidence)).

Return type:

(tuple)

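Example

A hedged sketch that chains parse_xml_detection with the builder, assuming the annotation format produced by parse_xml_detection is the one build expects; paths and class names are placeholders.

>>> from renom_img.api.utility.load import parse_xml_detection
>>> from renom_img.api.utility.target import DataBuilderDetection
>>> annotation_list = parse_xml_detection(xml_path_list)
>>> builder = DataBuilderDetection(class_map=['cat', 'dog'], imsize=(224, 224))
>>> x, y = builder.build(img_path_list, annotation_list)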
resize_img ( img_list , annotation_list )
class DataBuilderSegmentation ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a semantic segmentation task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )
Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – The format of annotation list is as follows.
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

Batch of images and an ndarray whose shape is (batch size, # classes, width, height).

Return type:

(tuple)

crop_to_square ( image )
load_annotation ( path )

Loads annotation data

Parameters: path – A path of an annotation file.
Returns: Returns annotation data (numpy.array), the ratio of the given width to the actual image width, and the ratio of the given height to the actual image height.
Return type: (tuple)
load_img ( path )

Loads an image

Parameters: path ( str ) – A path of an image
Returns:
Returns image(numpy.array), the ratio of the given width to the actual image width,
and the ratio of the given height to the actual image height
Return type: (tuple)
resize ( img_list , label_list )

renom_img.api.utility.augmentation

class Augmentation ( process_list )

Bases: object

This class applies augmentation to images.
An Augmentation instance is passed to the ImageDistributor module,
and is called only while the training process is running.
You can choose augmentation methods from the process module.
Parameters: process_list ( list of Process modules ) – List of process modules. The following example selects Shift, Rotate, Flip and WhiteNoise.

Example

>>> from renom_img.api.utility.augmentation import Augmentation
>>> from renom_img.api.utility.augmentation.process import Flip, Shift, Rotate, WhiteNoise
>>> from renom_img.api.utility.distributor.distributor import ImageDistributor
>>> aug = Augmentation([
...     Shift(40, 40),
...     Rotate(),
...     Flip(),
...     WhiteNoise()
... ])
>>> distributor = ImageDistributor(
...     img_path_list,
...     label_list,
...     builder,
...     aug,
...     num_worker=num_worker
... )
transform ( x , y=None , mode='classification' )

This function applies augmentation to the data passed from the ImageDistributor.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Return type:

tuple
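Example

A hedged sketch continuing the Augmentation instance from the example above; x and y are placeholders for a batch of loaded images and their annotations.

>>> new_x, new_y = aug.transform(x, y, mode='classification')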

flip ( x , y=None , mode='classification' )

Flip image randomly.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import flip
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> flipped_img = flip(img_list)
horizontalflip ( x , y=None , prob=True , mode='classification' )

Randomly flip images about the vertical axis only (horizontal flip).

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import horizontalflip
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> flipped_img = horizontalflip(img_list)
verticalflip ( x , y=None , prob=True , mode='classification' )

Randomly flip images about the horizontal axis only (vertical flip).

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import verticalflip
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> flipped_img = verticalflip(img_list)
random_crop ( x , y=None , padding=4 , mode='classification' )

Crop image randomly.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import random_crop
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> cropped_img = random_crop(img_list)
center_crop ( x , y , mode='classification' )

Crop image in the center.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import center_crop
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> cropped_img = center_crop(img_list)
shift ( x , y=None , horizontal=10 , vertical=10 , mode='classification' )

Shift images randomly according to given parameters.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import shift
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> shifted_img = shift(img_list)
rotate ( x , y=None , mode='classification' )

Rotate images randomly by 0, 90, 180, or 270 degrees.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Examples

>>> from renom_img.api.utility.augmentation.process import rotate
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> rotated_img = rotate(img_list)
white_noise ( x , y=None , std=0.01 , mode='classification' )

Add white noise to images.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # list of annotations for x.
]

Return type:

tuple

Examples

>>> from renom_img.api.utility.augmentation.process import white_noise
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> white_noise_img = white_noise(img_list)
distortion ( x , y , mode='classification' )

Randomly distort image contents while maintaining image shape.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # list of annotations for x.
]

Return type:

tuple

Examples

>>> from renom_img.api.utility.augmentation.process import distortion
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img1 = Image.open(img1_path).convert('RGB')
>>> img1 = np.asarray(img1).transpose(2,0,1).astype(np.float32)
>>>
>>> img2 = Image.open(img2_path).convert('RGB')
>>> img2 = np.asarray(img2).transpose(2,0,1).astype(np.float32)
>>>
>>> img_list = [img1, img2]
>>> distorted_img = distortion(img_list)
color_jitter ( x , y=None , h=0.1 , s=0.1 , v=0.1 , mode='classification' )

Color Jitter. Performs random scaling on Hue, Saturation and Brightness values in HSV space.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • h ( float , or tuple/list of two floats ) – Scaling factor for hue values in HSV space. If a list [a, b] or tuple (a, b) is given, the h values are scaled by a factor sampled from a uniform distribution over [a, b]. If a float is given, the factor is sampled from a uniform distribution over [1-h, 1+h].
  • s ( float , or tuple/list of two floats ) – Scaling factor for saturation values in HSV space. If a list [a, b] or tuple (a, b) is given, the s values are scaled by a factor sampled from a uniform distribution over [a, b]. If a float is given, the factor is sampled from a uniform distribution over [1-s, 1+s].
  • v ( float , or tuple/list of two floats ) – Scaling factor for brightness values in HSV space. If a list [a, b] or tuple (a, b) is given, the v values are scaled by a factor sampled from a uniform distribution over [a, b]. If a float is given, the factor is sampled from a uniform distribution over [1-v, 1+v].
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # list of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import color_jitter
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = color_jitter(x, h=0.1, s=0.1, v=0.2)
contrast_norm ( x , y=None , alpha=0.5 , per_channel=False , mode='classification' )

Contrast Normalization

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • alpha ( float or list of two floats ) – A higher value increases contrast and a lower value decreases it. If a list [a, b] is provided, the alpha value is sampled from a uniform distribution over [a, b). If a float is provided, alpha is set to that constant value.
  • per_channel ( Bool ) – Whether to apply contrast normalization to each channel separately. If alpha is given as a list, a different value is sampled for each channel.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import contrast_norm
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = contrast_norm(x, alpha=0.4)
random_brightness ( x , y=None , delta=32 , mode='classification' )

Random Brightness Adjustment

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • delta ( int ) – Range of values (-delta to +delta) for randomly fluctuating pixel values.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import random_brightness
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = random_brightness(x, delta=16)
random_hue ( x , y=None , mode='classification' )

Random Hue Adjustment

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • max_delta ( float ) – Maximum hue fluctuation parameter. Must be in range [0, 0.5].
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import random_hue
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = random_hue(x, max_delta=0.2)
random_saturation ( x , y=None , mode='classification' )

Random Saturation Adjustment

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • ratio ( float ) – Saturation fluctuation parameter. Must be in range [0, 1].
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import random_saturation
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = random_saturation(x, ratio=0.2)
random_lighting ( x , y=None , mode='classification' )

Random Lighting Adjustment

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import random_lighting
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = random_lighting(x)
random_expand ( x , y=None , mode='classification' )

Randomly expand images

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import random_expand
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = random_expand(x)
shear ( x , y=None , mode='classification' )

Randomly shear image

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotations ) – List of annotations for x.
  • max_shear_factor ( int ) – Angle range for randomly shearing image contents.
Returns:

list of transformed images and list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotations) # List of annotations for x.
]

Example

>>> from renom_img.api.utility.augmentation.process import shear
>>> from PIL import Image
>>> import numpy as np
>>>
>>> img = Image.open(img_path).convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = shear(x, max_shear_factor=8)

renom_img.api.utility.evaluate

class EvaluatorClassification ( prediction , target )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for classification tasks

Parameters:
  • prediction ( list ) – A list of predicted class
  • target ( list ) – A list of target class. The format is as follows
Example of the arguments “prediction” and “target”.
    [
        class_id1(int),
        class_id2(int),
        class_id3(int),
    ]

Example

>>> evaluator = EvaluatorClassification(prediction, target)
>>> evaluator.precision()
>>> evaluator.recall()
accuracy ( )

Returns accuracy.

Returns: Accuracy
Return type: (float)
f1 ( )

Returns f1 for each class and mean f1 score.

Returns: 2 values are returned. Each element represents a dictionary of F1 score for each class and mean F1 score(float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): f1_score(float),
        class_id2(int): f1_score(float)
    }, mean_f1_score(float))
precision ( )

Returns precision for each class and mean precision

Returns: 2 values are returned. Each element represents a dictionary of precision for each class and the mean precision(float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): precision(float),
        class_id2(int): precision(float),
    }, mean_precision(float))
recall ( )

Returns recall for each class and mean recall

Returns: 2 values are returned. Each element represents a dictionary of recall for each class and the mean recall(float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): recall(float),
        class_id2(int): recall(float),
    }, mean_recall(float))
report ( round_off=3 )

Outputs a table which shows precision, recall, F1 score, the number of true positive pixels and the number of ground truth pixels for each class.

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns:
            Precision  Recall  F1 score  #pred/#target
class_id1:  0.800      0.308   0.444     4/13
class_id2:  0.949      0.909   0.929     150/165
...
Average:    0.364      0.500   0.421     742/1256
class EvaluatorDetection ( prediction , target , num_class=None )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for object detection tasks

Parameters:
  • prediction ( list ) – A list of prediction results. The format is as follows
  • target ( list ) – A list of ground truth boxes and classes.
  • num_class ( int ) – The number of classes
Example of the arguments “prediction” and “target”.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]

Example

>>> evaluator = EvaluatorDetection(pred, gt)
>>> evaluator.mAP()
>>> evaluator.mean_iou()
AP ( iou_thresh=0.5 , round_off=3 )

Returns AP (Average Precision) for each class.

\(AP = \frac{1}{11} \sum_{r \in \{0.0, 0.1, \ldots, 1.0\}} AP_{r}\)

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – Number of decimal places in the output.
Returns:

AP for each class. The format is as follows

Return type:

(dictionary)

{
    class_id1(int): AP1 (float),
    class_id2(int): AP2 (float),
    class_id3(int): AP3 (float),
}
iou ( iou_thresh=0.5 , round_off=3 )

Returns IoU for each class

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – Number of decimal places in the output.
Returns:

IoU for each class. The format is as follows

{
    class_id1(int): iou1 (float),
    class_id2(int): iou2 (float),
    class_id3(int): iou3 (float),
}

Return type:

(dictionary)

mAP ( iou_thresh=0.5 , round_off=3 )

Returns mAP (mean Average Precision)

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – Number of decimal places in the output.
Returns:

mAP(mean Average Precision).

Return type:

(float)

mean_iou ( iou_thresh=0.5 , round_off=3 )

Returns mean IoU for all classes

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – Number of decimal places in the output.
Returns:

Mean IoU

Return type:

(float)

plot_pr_curve ( iou_thresh=0.5 , class_names=None )

Plot a precision-recall curve.

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • class_names ( list ) – List of class names whose precision-recall curves are plotted, or a single string to plot the curve of one class only. This specifies which classes' precision-recall curves to output.
prec_rec ( iou_thresh=0.5 )

Return precision and recall for each class

Parameters: iou_thresh ( float ) – IoU threshold. Defaults to 0.5
Returns: 2 values are returned. Each element represents a dictionary of precision for each class and a dictionary of recall for each class. The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): [precision1(float), precision2(float), ..],
        class_id2(int): [precision3(float), precision4(float), ..],
    },
    {
        class_id1(int): [recall1(float), recall2(float), ..],
        class_id2(int): [recall3(float), recall4(float), ..]
    })
report ( iou_thresh=0.5 , round_off=3 )

Outputs a table which shows AP, IoU, the number of predicted instances for each class, and the number of ground truth instances for each class.

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – Number of decimal places in the output.
Returns:

                 AP     IoU    #pred/#target
class_name1:     0.091  0.561  1/13
class_name2:     0.369  0.824  6/15
...
mAP / mean IoU:  0.317  0.698  266/686

class EvaluatorSegmentation ( prediction , target , ignore_class=0 )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for semantic segmentation tasks

Parameters:
  • prediction ( list ) – A list of predicted classes.
  • target ( list ) – A list of target classes. The format is as follows.
  • ignore_class ( int ) – Class ignored in the output table (the background class). Defaults to 0.
Example of the arguments, “prediction” and “target”.
    [
        class_id1(int),
        class_id2(int),
        class_id3(int),
    ]

Example

>>> evaluator = EvaluatorSegmentation(prediction, target)
>>> evaluator.iou()
>>> evaluator.precision()
f1 ( round_off=3 )

Returns f1 for each class and mean f1 score

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns: 2 values are returned. Each element represents a dictionary of F1 score for each class and mean F1 score(float).
Return type: (tuple)
iou ( round_off=3 )

Returns IoU for each class and mean IoU

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns: 2 values are returned. Each element represents a dictionary of IoU for each class and mean IoU (float).
Return type: (tuple)
precision ( round_off=3 )

Returns precision for each class and mean precision

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns: 2 values are returned. Each element represents a dictionary of precision for each class and the mean precision(float).
Return type: (tuple)
recall ( round_off=3 )

Returns recall for each class and mean recall

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns: 2 values are returned. Each element represents a dictionary of recall for each class and mean recall(float).
Return type: (tuple)
report ( round_off=3 )

Outputs a table which shows IoU, precision, recall, F1 score, the number of true positive pixels and the number of ground truth pixels for each class.

Parameters: round_off ( int ) – Number of decimal places in the output.
Returns:
            IoU    Precision  Recall  F1 score  #pred/#target
class_id1:  0.178  0.226      0.457   0.303     26094/571520
class_id2:  0.058  0.106      0.114   0.110     25590/224398
...
Average:    0.317  0.698      0.404   0.259     5553608/18351769

renom_img.api.utility.distributor

class ImageDistributor ( img_path_list , label_list=None , target_builder=None , augmentation=None , imsize=None , num_worker=3 )

Bases: renom_img.api.utility.distributor.distributor.ImageDistributorBase

batch ( batch_size , target_builder=None , shuffle=True )
Parameters:
  • batch_size ( int ) – Batch size.
  • target_builder – Target builder (e.g. a DataBuilder instance) used to construct each batch.
  • shuffle ( bool ) – Whether to shuffle the data when creating batches.
Yields:

(list of image paths, list of labels)

split ( ratio , shuffle=True )

Split images and labels into a training set and a validation set.

Parameters:
  • ratio ( float ) – Ratio between the training set and the validation set.
  • shuffle ( bool ) – Whether to shuffle the data before splitting.
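Example

A hedged usage sketch. Iterating over batch follows the Yields description above; the assumption that split returns a (training, validation) pair of distributors is mine.

>>> dist = ImageDistributor(img_path_list, label_list, builder)
>>> train_dist, valid_dist = dist.split(0.8)  # assumed return value: two distributors
>>> for x, y in train_dist.batch(16):
...     pass  # x: list of image paths, y: corresponding labels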

renom_img.api.utility.visualize.grad_cam

class GuidedGradCam ( model_cam )

Bases: object

Guided Grad-CAM implementation for visualizing feature-map importance in CNN classification models

Parameters: model_cam ( ReNom model instance ) – CNN-based classification model to be used for creating Guided Grad-CAM saliency maps. The model must be a ReNom instance of VGG, ResNet, ResNeXt or rm.Sequential. The model should use ReLU activation functions and be pre-trained on the same dataset used for Grad-CAM visualizations.
Returns: Guided backpropagation array, Grad-CAM(++) saliency map array, Guided Grad-CAM(++) array
Return type: (numpy.ndarray)

Example

>>> #This sample uses matplotlib to display files, so it is recommended to run this inside a Jupyter Notebook
>>>
>>> import renom as rm
>>> import numpy as np
>>> from PIL import Image
>>> import matplotlib.pyplot as plt
>>> from matplotlib.pyplot import cm
>>> from renom_img.api.classification.vgg import VGG16
>>> from renom_img.api.utility.visualize.grad_cam import GuidedGradCam
>>> from renom_img.api.utility.visualize.tools import load_img, preprocess_img, visualize_grad_cam
>>>
>>> model = VGG16()
>>>
>>> #Provide pre-trained model weights for same dataset you are producing Grad-CAM visualizations on
>>> model.load("my_pretrained_weights.h5")
>>>
>>> #Create Grad-CAM instance based on pre-trained model (VGG, ResNet, ResNeXt, or rm.Sequential)
>>> grad_cam = GuidedGradCam(model)
>>>
>>> #Provide path to image file for producing Grad-CAM visualizations
>>> img_path = '/home/username/path/to/images/cat_dog.jpg'
>>>
>>> #Load and pre-process image (must be same pre-processing as used during training)
>>> size = (224, 224)
>>> img = load_img(img_path, size)
>>> x = preprocess_img(img)
>>>
>>> #Select class_id (index of array in model's final output) to produce visualizations for. Must be consistent with class ID in trained model.
>>> class_id = 243
>>>
>>> #Generate Grad-CAM maps
>>> input_map, L, result = grad_cam(x, size, class_id=class_id, mode='normal')
>>>
>>> #Overlay Grad-CAM saliency map on image using matplotlib
>>> plt.imshow(img)
>>> plt.imshow(L, cmap = cm.jet, alpha = 0.6)
>>> plt.axis("off")
>>> plt.savefig("grad_cam_sample.png", bbox_inches='tight', pad_inches=0)
>>>
>>> #Visualize Guided Grad-CAM (original image, guided backpropagation, Grad-CAM saliency map, Guided Grad-CAM visualization)
>>> visualize_grad_cam(img, input_map, L, result)
>>>
>>> #Generate Grad-CAM++ maps
>>> input_map, L, result = grad_cam(x, size, class_id=class_id, mode='plus')
>>> #Visualize results (original image, guided backpropagation, Grad-CAM++ saliency map, Guided Grad-CAM++ visualization)
>>> visualize_grad_cam(img, input_map, L, result)

References

Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, Dhruv Batra
Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Aditya Chattopadhyay, Anirban Sarkar, Prantik Howlader, Vineeth N Balasubramanian
Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

check_for_relus ( model )

Assertion check to see if ReLU activation functions exist in the model

Parameters: model ( ReNom model instance ) – CNN-based classification model to check. The model must be a ReNom instance of VGG, ResNet, ResNeXt or rm.Sequential, should use ReLU activation functions, and should be pre-trained on the same dataset used for Grad-CAM visualizations.
Returns: Assertion result
Return type: (bool)
forward_cam ( x , class_id , mode , node )

Calculates forward pass through model for Grad-CAM

Parameters:
  • x ( renom.Variable ) – Input data for model after pre-processing has been applied
  • class_id ( int ) – Class ID for creating visualizations
  • mode ( string ) – Flag for selecting Grad-CAM or Grad-CAM++
  • node ( int ) – Index representing final convolutional layer (used in rm.Sequential case only)
Returns:

Final layer output and final convolution layer output

Return type:

(renom.Variable)

forward_gb ( x_gb , class_id , mode )

Calculates forward pass through model for guided backpropagation

Parameters:
  • x_gb ( renom.Variable ) – Input data for model after pre-processing has been applied
  • class_id ( int ) – Class ID for creating visualizations
  • mode ( string ) – Flag for selecting Grad-CAM or Grad-CAM++
Returns:

Final layer output

Return type:

(renom.Variable)

generate_map ( y_c , final_conv , gb_map , mode , size )

Generates Guided Grad-CAM and Grad-CAM saliency maps as numpy arrays

Parameters:
  • y_c ( renom.Variable ) – Output of final layer in forward pass through model
  • final_conv ( renom.Variable ) – Output of final convolution layer in forward pass through model
  • gb_map ( numpy.ndarray ) – numpy array representing normalized guided backpropagation output
  • mode ( string ) – Flag for selecting Grad-CAM (‘normal’, default) or Grad-CAM++ (‘plus’)
  • size ( tuple ) – Tuple of integers representing the original image size
Returns:

Grad-CAM saliency map and Guided Grad-CAM map as numpy arrays

Return type:

(numpy.ndarray)

get_model_type ( model_cam )

Gets model type information for model passed to Grad-CAM

Parameters: model_cam ( ReNom model instance ) – CNN-based classification model to be used for creating Guided Grad-CAM saliency maps. The model must be a ReNom instance of VGG, ResNet, ResNeXt or rm.Sequential, should use ReLU activation functions, and should be pre-trained on the same dataset used for Grad-CAM visualizations.
Returns: Model class name
Return type: (string)
get_predicted_class ( x )

Returns class that model predicts given input data

Parameters: x ( renom.Variable ) – Input data for model after pre-processing has been applied
Returns:

np.argmax index of final model output

Return type:

(int)

get_scaling_factor ( size , L )

Calculates scaling factor for aligning Grad-CAM map and input image sizes

Parameters:
  • size ( tuple ) – tuple of integers representing original image size
  • L ( ndarray ) – Grad-CAM saliency map
Returns:

width and height scaling factors for aligning final array sizes

Return type:

float, float

guided_backprop ( x_gb , y_gb )

Calculates guided backpropagation backward pass

Parameters:
  • x_gb ( renom.Variable ) – Input data for model after pre-processing has been applied
  • y_gb ( renom.Variable ) – Output of guided backpropagation forward pass for x_gb
Returns:

Raw and normalized guided backpropagation outputs

Return type:

(numpy.ndarray)

renom_img.api.utility.misc

draw_box ( img , prediction , show_size=None , font_path=None , color_list=None )

Function for drawing bounding boxes, class names and scores on an input image.

Parameters:
  • img ( string , ndarray ) – A path to an image, or an image array.
  • prediction ( list ) – List of annotations. Each annotation is a dictionary that includes the keys box , name and score . The format is below.
[
    {'box': [x(float), y, w, h], 'name': class name(string), 'score': score(float)},
    {'box': [x(float), y, w, h], 'name': class name(string), 'score': score(float)},
    ...
]

  • font_path ( string ) – Path to a font file used for rendering object names. If None is given, the default font will be used.
  • color_list ( list ) – A list of colors used for rendering bounding boxes. If None is given, the default color list will be used.
Returns: The image with the bounding boxes drawn on it.
Return type: (PIL.Image)

Example

>>> from PIL import Image
>>> from renom_img.api.utility.load import *
>>> prediction = parse_xml_detection(prediction_xml_path_list)[0]
>>> bbox_image = draw_box(img_path, prediction)

Note

The box values are relative coordinates, so they lie in the range [0.0, 1.0]. If you pass the argument img as an ndarray, it must be in (channel, height, width) format. For example, an RGB image whose size is (100, 10) corresponds to a (3, 10, 100) array.

draw_segment ( img , prediction , color_list=None , show_background=True )

Function for drawing segments according to the argument prediction .

Parameters:
  • img ( string , ndarray ) – A path to an image, or an image array.
  • prediction ( ndarray ) – Predicted annotations. This must be a matrix whose size equals the image size.
  • color_list ( list ) – A list of colors used for rendering segments. If None is given, the default color list will be used.
  • show_background ( bool ) – If this is False, the background class (id 0) will not be drawn.
Returns:

The image with the prediction result drawn on it.

Return type:

(PIL.Image)

Example

>>> from PIL import Image
>>> prediction = Image.open(predicted_result)
>>> segment_image = draw_segment(img_path, prediction)

Note

If you pass the argument img as an ndarray, it must be in (channel, height, width) format. Likewise, the argument prediction must be a matrix in (channel, height, width) format. For example, an RGB image whose size is (100, 10) corresponds to a (3, 10, 100) array.

pil2array ( img )

Function for converting a PIL image to a numpy array.

Example

>>> from renom_img.api.utility.misc.display import pil2array
>>> from PIL import Image
>>> img = Image.open(img_path)
>>> converted_img = pil2array(img)
Parameters: img ( PIL.Image ) – PIL Image
Returns: The converted numpy array.
Return type: (numpy.ndarray)