renom_img.api.utility

renom_img.api.utility.box

rescale ( box , before_size , after_size )

Rescales a box's coordinates and size to a specified size.

Parameters:
  • box ( list ) – This list has 4 elements that represent the box's coordinates and size.
  • before_size ( float ) – Size of the box before rescaling.
  • after_size ( float ) – Size of the box after rescaling.
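
The rescaling itself is simple arithmetic. A minimal sketch of the idea (hypothetical, assuming every element of the box is scaled by the ratio of the new size to the old; not the library's actual implementation):

# Hypothetical sketch of the rescaling arithmetic, not renom_img's implementation.
def rescale_sketch(box, before_size, after_size):
    ratio = after_size / before_size
    # Scale each coordinate/size element by the same ratio.
    return [v * ratio for v in box]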
transform2xywh ( box )

This function converts a box's coordinate format from (x1, y1, x2, y2) to (x, y, w, h).

( x1 , y1 ) represents the coordinate of the upper left corner. ( x2 , y2 ) represents the coordinate of the lower right corner.

( x , y ) represents the center of the bounding box. ( w , h ) represents the width and height of the bounding box.

The format of the argument box must follow the example below.

[x1(float), y1(float), x2(float), y2(float)]
Parameters: box ( list ) – This list has 4 elements that represent the above coordinates.
Returns: The reformatted bounding box.
Return type: (list)
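
The conversion is simple arithmetic. A minimal sketch of this direction (an illustration, not the library's implementation):

# Hypothetical sketch of the (x1, y1, x2, y2) -> (x, y, w, h) conversion.
def to_xywh(box):
    x1, y1, x2, y2 = box
    return [(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1]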
transform2xy12 ( box )

This function converts a box's coordinate format from (x, y, w, h) to (x1, y1, x2, y2).

( x , y ) represents the center of the bounding box. ( w , h ) represents the width and height of the bounding box.

( x1 , y1 ) represents the coordinate of the upper left corner. ( x2 , y2 ) represents the coordinate of the lower right corner.

The format of the argument box must follow the example below.

[x(float), y(float), w(float), h(float)]
Parameters: box ( list ) – This list has 4 elements that represent the above coordinates.
Returns: The reformatted bounding box.
Return type: (list)
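
A matching sketch of the inverse conversion (again an illustration, not the library's implementation):

# Hypothetical sketch of the (x, y, w, h) -> (x1, y1, x2, y2) conversion.
def to_xy12(box):
    x, y, w, h = box
    return [x - w / 2, y - h / 2, x + w / 2, y + h / 2]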
calc_iou_xyxy ( box1 , box2 )

This function calculates IoU for boxes in the coordinate format (x1, y1, x2, y2).

( x1 , y1 ) represents the coordinate of the upper left corner. ( x2 , y2 ) represents the coordinate of the lower right corner.

The format of the argument boxes must follow the example below.

[x1(float), y1(float), x2(float), y2(float)]
Parameters:
  • box1 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
  • box2 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
Returns:

The IoU value.

Return type:

(float)
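
IoU (intersection over union) is the area of overlap between the two boxes divided by the area of their union. A minimal sketch of the computation (an illustration, not the library's implementation):

# Hypothetical sketch of IoU for two (x1, y1, x2, y2) boxes.
def iou_xyxy_sketch(box1, box2):
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    ix2, iy2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    return inter / (area1 + area2 - inter)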

calc_iou_xywh ( box1 , box2 )

This function calculates IoU for boxes in the coordinate format (x, y, w, h).

( x , y ) represents the coordinate of the center. ( w , h ) represents the width and height.

The format of the argument boxes must follow the example below.

[x(float), y(float), w(float), h(float)]
Parameters:
  • box1 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
  • box2 ( list ) – List of a box. The list has 4 elements that represent above coordinates.
Returns:

The IoU value.

Return type:

(float)
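
For the (x, y, w, h) format the same computation applies once each box is converted to corner form, e.g. by combining the sketches above:

# Hypothetical sketch: convert to corner format, then reuse the IoU sketch above.
def iou_xywh_sketch(box1, box2):
    return iou_xyxy_sketch(to_xy12(box1), to_xy12(box2))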

renom_img.api.utility.load

parse_xml_detection ( xml_path_list , num_thread=8 )

The XML files must be in Pascal VOC format.

Parameters:
  • xml_path_list ( list ) – List of XML file paths.
  • num_thread ( int ) – Number of threads used for parsing the XML files.
Returns:

This returns a list of annotations. Each annotation is a list of dictionaries with the keys ‘box’, ‘name’ and ‘class’. The structure is shown below.

Return type:

(list)

# An example of returned list.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        {'box': [x(float), y, w, h], 'name': class_name(string), 'class': id(int)},
        ...
    ]
]
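
A hedged usage sketch, assuming xml_path_list holds paths to Pascal VOC annotation files:

>>> from renom_img.api.utility.load import parse_xml_detection
>>> annotation_list = parse_xml_detection(xml_path_list, num_thread=8)
>>> annotation_list[0][0]['name']  # class name of the 1st object in the 1st image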

renom_img.api.utility.nms

nms ( preds , threshold=0.5 )

Non-Maximum Suppression

Parameters:
  • preds ( list ) – A list of predicted boxes. The format is as follows.
  • threshold ( float , optional ) – Defaults to 0.5 . The ratio of overlap (IoU) between two boxes above which the lower-scoring box is suppressed.
Example of the argument “preds”.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ]
]
Returns: A list of predicted boxes remaining after suppression, in the same format as preds.
Return type: (list)
Example of return value.
[
    [ # Objects of 1st image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ],
    [ # Objects of 2nd image.
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
        ...
    ]
]
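
A hedged usage sketch, assuming preds follows the format above:

>>> from renom_img.api.utility.nms import nms
>>> suppressed = nms(preds, threshold=0.5)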
soft_nms ( preds , threshold=0.5 )

Soft Non-Maximum Suppression

Parameters:
  • preds ( list ) – A list of predicted boxes. The format is as follows.
  • threshold ( float , optional ) – Defaults to 0.5 . The ratio of overlap (IoU) between two boxes above which scores are decayed.
Example of the argument “preds”.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]
Returns: A list of predicted boxes with rescored confidences, in the same format as preds.
Return type: (list)
Example of the output.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]
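
A hedged usage sketch, assuming preds follows the format above:

>>> from renom_img.api.utility.nms import soft_nms
>>> rescored = soft_nms(preds, threshold=0.5)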

References

Navaneeth Bodla, Bharat Singh, Rama Chellappa, Larry S. Davis,
Soft-NMS – Improving Object Detection With One Line of Code

renom_img.api.utility.target

class DataBuilderClassification ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a classification task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )

Builds an array of images and corresponding labels

Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – List of class ids, e.g. [1, 4, 6] (int).
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

A batch of images and the corresponding one-hot labels for each image in the batch

Return type:

(tuple)
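
A hedged usage sketch; the class names, image size and path lists here are hypothetical:

>>> from renom_img.api.utility.target import DataBuilderClassification
>>> builder = DataBuilderClassification(class_map=['cat', 'dog'], imsize=(224, 224))
>>> x, y = builder.build(img_path_list, annotation_list)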

class DataBuilderDetection ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a detection task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )
Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – List of annotations in the format returned by parse_xml_detection (see above).
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

A batch of images and an ndarray whose shape is (# images, maximum number of objects in an image * (4 coordinates + 1 confidence))

Return type:

(tuple)
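
A hedged usage sketch, chaining in the annotation format produced by parse_xml_detection; the class names and image size are hypothetical:

>>> from renom_img.api.utility.load import parse_xml_detection
>>> from renom_img.api.utility.target import DataBuilderDetection
>>> annotation_list = parse_xml_detection(xml_path_list)
>>> builder = DataBuilderDetection(class_map=['car', 'person'], imsize=(320, 320))
>>> x, y = builder.build(img_path_list, annotation_list)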

class DataBuilderSegmentation ( class_map , imsize )

Bases: renom_img.api.utility.target.DataBuilderBase

Data builder for a semantic segmentation task

Parameters:
  • class_map ( array ) – Array of class names
  • imsize ( int or tuple ) – Input image size
build ( img_path_list , annotation_list , augmentation=None , **kwargs )
Parameters:
  • img_path_list ( list ) – List of input image paths.
  • annotation_list ( list ) – List of annotations corresponding to each image.
  • augmentation ( Augmentation ) – Instance of the augmentation class.
Returns:

A batch of images and an ndarray whose shape is (batch size, # classes, width, height)

Return type:

(tuple)

crop_to_square ( image )

Crops the given image to a square.
load_annotation ( path )

Loads annotation data

Parameters: path – A path of an annotation file
Returns: Returns annotation data (numpy.array), the ratio of the given width to the actual image width, and the ratio of the given height to the actual image height
Return type: (tuple)
load_img ( path )

Loads an image

Parameters: path ( str ) – A path of an image
Returns:
Returns the image (numpy.array), the ratio of the given width to the actual image width,
and the ratio of the given height to the actual image height
Return type: (tuple)

renom_img.api.utility.augmentation

class Augmentation ( process_list )

Bases: object

This class applies augmentation to images.
An Augmentation instance is passed to the ImageDistributor module,
and is called only while the training process is running.
You can choose augmentation methods from the Process module.
Parameters: process_list ( list of Process modules ) – List of Process modules. You can choose from Flip, Shift, Rotate and WhiteNoise.

Example

>>> from renom_img.api.utility.augmentation import Augmentation
>>> from renom_img.api.utility.augmentation.process import Flip, Shift, Rotate, WhiteNoise
>>> from renom_img.api.utility.distributor.distributor import ImageDistributor
>>> aug = Augmentation([
...     Shift(40, 40),
...     Rotate(),
...     Flip(),
...     WhiteNoise()
... ])
>>> distributor = ImageDistributor(
...     img_path_list,
...     label_list,
...     builder,
...     aug,
...     num_worker
... )
transform ( x , y=None , mode='classification' )

This function applies the augmentations. It is called from within ImageDistributor.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
Returns:

A list of transformed images and a list of annotations for x.

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Return type:

tuple
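
A hedged usage sketch (ImageDistributor normally calls this internally); x and y are hypothetical path and annotation lists:

>>> aug = Augmentation([Flip(), Rotate()])
>>> new_x, new_y = aug.transform(x, y, mode='detection')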

flip ( x , y=None , mode='classification' )

Flips images randomly.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
Returns:

A list of transformed images and a list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Examples

>>> import numpy as np
>>> from renom_img.api.utility.augmentation.process import flip
>>> from PIL import Image
>>>
>>> img1 = Image.open(img_path1)
>>> img2 = Image.open(img_path2)
>>> img_list = np.array([img1, img2])
>>> flipped_img = flip(img_list)
shift ( x , y=None , horizontal=10 , vertical=10 , mode='classification' )

Shifts images randomly within the given horizontal and vertical ranges.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
Returns:

A list of transformed images and a list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Examples

>>> import numpy as np
>>> from renom_img.api.utility.augmentation.process import shift
>>> from PIL import Image
>>>
>>> img1 = Image.open(img_path1)
>>> img2 = Image.open(img_path2)
>>> img_list = np.array([img1, img2])
>>> shifted_img = shift(img_list)
rotate ( x , y=None , mode='classification' )

Rotates images randomly by 0, 90, 180 or 270 degrees.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
Returns:

A list of transformed images and a list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Examples

>>> import numpy as np
>>> from renom_img.api.utility.augmentation.process import rotate
>>> from PIL import Image
>>>
>>> img1 = Image.open(img_path1)
>>> img2 = Image.open(img_path2)
>>> img_list = np.array([img1, img2])
>>> rotated_img = rotate(img_list)
white_noise ( x , y=None , std=0.01 , mode='classification' )

Adds white noise to images.

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
  • std ( float ) – Standard deviation of the noise. Defaults to 0.01.
Returns:

A list of transformed images and a list of annotations for x.

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Return type:

tuple

Examples

>>> import numpy as np
>>> from renom_img.api.utility.augmentation.process import white_noise
>>> from PIL import Image
>>>
>>> img1 = Image.open(img_path1)
>>> img2 = Image.open(img_path2)
>>> img_list = np.array([img1, img2])
>>> noise_img = white_noise(img_list)
contrast_norm ( x , y=None , alpha=0.5 , per_channel=False , mode='classification' )

Contrast Normalization

Parameters:
  • x ( list of str ) – List of image paths.
  • y ( list of annotation ) – List of annotations for x. This is not used during prediction.
  • alpha ( float or list of two floats ) – A higher value increases contrast and a lower value decreases it. If a list [a, b] is given, alpha is sampled from the uniform distribution over [a, b). If a float is given, that constant value of alpha is used.
  • per_channel ( bool ) – Whether to apply contrast normalization to each channel separately. If alpha is given as a list, a different value is used for each channel.
Returns:

A list of transformed images and a list of annotations for x.

Return type:

tuple

[
    x (list of numpy.ndarray), # List of transformed images.
    y (list of annotation) # list of annotation for x.
]

Example

>>> import numpy as np
>>> from PIL import Image
>>> from renom_img.api.utility.augmentation.process import contrast_norm
>>> img = Image.open(img_path)
>>> img = img.convert('RGB')
>>> img = np.array(img).transpose(2, 0, 1).astype(np.float32)
>>> x = np.array([img])
>>> new_x, new_y = contrast_norm(x, alpha=0.4)

renom_img.api.utility.evaluate

class EvaluatorClassification ( prediction , target )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for classification tasks

Parameters:
  • prediction ( list ) – A list of predicted classes
  • target ( list ) – A list of target classes. The format is as follows
Example of the arguments, “prediction” and “target”.
    [
        class_id1(int),
        class_id2(int),
        class_id3(int),
    ]

Example

>>> evaluator = EvaluatorClassification(prediction, target)
>>> evaluator.precision()
>>> evaluator.recall()
accuracy ( )

Returns accuracy.

Returns: Accuracy
Return type: (float)
f1 ( )

Returns f1 for each class and mean f1 score.

Returns: 2 values are returned: a dictionary of F1 scores for each class and the mean F1 score (float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): f1 score(float),
        class_id2(int): f1_score(float)
    }, mean_f1_score(float))
precision ( )

Returns precision for each class and mean precision

Returns: 2 values are returned: a dictionary of precision for each class and the mean precision (float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): precision(float),
        class_id2(int): precision(float),
    }, mean_precision(float))
recall ( )

Returns recall for each class and mean recall

Returns: 2 values are returned: a dictionary of recall for each class and the mean recall (float). The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): recall(float),
        class_id2(int): recall(float),
    }, mean_recall(float))
report ( round_off=3 )

Outputs a table which shows precision, recall, F1 score, the number of true positive pixels and the number of ground truth pixels for each class.

Parameters: round_off ( int ) – The number of decimal places in the output
Returns:
            Precision  Recall  F1 score  #pred/#target
class_id1:  0.800      0.308   0.444     4/13
class_id2:  0.949      0.909   0.929     150/165
...
Average     0.364      0.500   0.421     742/1256
class EvaluatorDetection ( prediction , target , num_class=None )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for object detection tasks

Parameters:
  • prediction ( list ) – A list of prediction results. The format is as follows
  • target ( list ) – A list of ground truth boxes and classes.
  • num_class ( int ) – The number of classes
Example of the arguments, “prediction” and “target”.
    [
        [ # Objects of 1st image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ],
        [ # Objects of 2nd image.
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            {'box': [x(float), y, w, h], 'class': class_id(int), 'score': score},
            ...
        ]
    ]

Example

>>> evaluator = EvaluatorDetection(pred, gt)
>>> evaluator.mAP()
>>> evaluator.mean_iou()
AP ( iou_thresh=0.5 , round_off=3 )

Returns AP(Average Precision) for each class.

\(AP = \frac{1}{11} \sum_{r \in \{0.0, 0.1, \dots, 1.0\}} AP_{r}\), where \(AP_{r}\) is the interpolated precision at recall level \(r\) (the 11-point average).

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – The number of decimal places in the output
Returns:

AP for each class. The format is as follows

Return type:

(dictionary)

{
    class_id1(int): AP1 (float),
    class_id2(int): AP2 (float),
    class_id3(int): AP3 (float),
}
iou ( iou_thresh=0.5 , round_off=3 )

Returns IoU for each class

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – The number of decimal places in the output
Returns:

IoU for each class. The format is as follows

{
    class_id1(int): iou1 (float),
    class_id2(int): iou2 (float),
    class_id3(int): iou3 (float),
}

Return type:

(dictionary)

mAP ( iou_thresh=0.5 , round_off=3 )

Returns mAP (mean Average Precision)

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – The number of decimal places in the output
Returns:

mAP(mean Average Precision).

Return type:

(float)

mean_iou ( iou_thresh=0.5 , round_off=3 )

Returns mean IoU for all classes

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – The number of decimal places in the output
Returns:

Mean IoU

Return type:

(float)

plot_pr_curve ( iou_thresh=0.5 , class_names=None )

Plot a precision-recall curve.

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • class_names ( list ) – List of class names, or a single string to plot the curve for only one class. This specifies which classes' precision-recall curves are output.
prec_rec ( iou_thresh=0.5 )

Returns precision and recall for each class

Parameters: iou_thresh ( float ) – IoU threshold. Defaults to 0.5
Returns: 2 values are returned: a dictionary of precision values for each class and a dictionary of recall values for each class. The format is as follows.
Return type: (tuple)
Example of outputs.
    ({
        class_id1(int): [precision1(float), precision2(float), ..],
        class_id2(int): [precision3(float), precision4(float), ..],
    },
    {
        class_id1(int): [recall1(float), recall2(float), ..],
        class_id2(int): [recall3(float), recall4(float), ..]
    })
report ( iou_thresh=0.5 , round_off=3 )

Outputs a table which shows AP, IoU, the number of predicted instances for each class, and the number of ground truth instances for each class.

Parameters:
  • iou_thresh ( float ) – IoU threshold. The default value is 0.5.
  • round_off ( int ) – The number of decimal places in the output
Returns:

                AP     IoU    #pred/#target
class_name1:    0.091  0.561  1/13
class_name2:    0.369  0.824  6/15
...
mAP / mean IoU  0.317  0.698  266/686

class EvaluatorSegmentation ( prediction , target , ignore_class=0 )

Bases: renom_img.api.utility.evaluate.EvaluatorBase

Evaluator for semantic segmentation tasks

Parameters:
  • prediction ( list ) – A list of predicted classes
  • target ( list ) – A list of target classes. The format is as follows
  • ignore_class ( int ) – Class id to ignore in the output table (the background class). Defaults to 0.
Example of the arguments, “prediction” and “target”.
    [
        class_id1(int),
        class_id2(int),
        class_id3(int),
    ]

Example

>>> evaluator = EvaluatorSegmentation(prediction, target)
>>> evaluator.iou()
>>> evaluator.precision()
f1 ( round_off=3 )

Returns f1 for each class and mean f1 score

Parameters: round_off ( int ) – The number of decimal places in the output
Returns: 2 values are returned: a dictionary of F1 scores for each class and the mean F1 score (float).
Return type: (tuple)
iou ( round_off=3 )

Returns IoU for each class and the mean IoU

Parameters: round_off ( int ) – The number of decimal places in the output
Returns: 2 values are returned: a dictionary of IoU for each class and the mean IoU (float).
Return type: (tuple)
precision ( round_off=3 )

Returns precision for each class and the mean precision

Parameters: round_off ( int ) – The number of decimal places in the output
Returns: 2 values are returned: a dictionary of precision for each class and the mean precision (float).
Return type: (tuple)
recall ( round_off=3 )

Returns recall for each class and mean recall

Parameters: round_off ( int ) – The number of decimal places in the output
Returns: 2 values are returned: a dictionary of recall for each class and the mean recall (float).
Return type: (tuple)
report ( round_off=3 )

Outputs a table which shows IoU, precision, recall, F1 score, the number of true positive pixels and the number of ground truth pixels for each class.

Parameters: round_off ( int ) – The number of decimal places in the output
Returns:
            IoU    Precision  Recall  F1 score  #pred/#target
class_id1:  0.178  0.226      0.457   0.303     26094/571520
class_id2:  0.058  0.106      0.114   0.110     25590/224398
...
Average     0.317  0.698      0.404   0.259     5553608/18351769

renom_img.api.utility.distributor

class ImageDistributor ( img_path_list , label_list=None , target_builder=None , augmentation=None , imsize=None , num_worker=3 )

Bases: renom_img.api.utility.distributor.distributor.ImageDistributorBase

batch ( batch_size , target_builder=None , shuffle=True )
Parameters:
  • batch_size ( int ) – Batch size
  • target_builder ( DataBuilder ) – Target builder used to build each batch
  • shuffle ( bool ) – Whether to shuffle the data before batching
Yields:

(list of image paths, list of label paths)

split ( ratio , shuffle=True )

Splits the images and labels into a training set and a validation set.

Parameters:
  • ratio ( float ) – Ratio of the training set to the validation set
  • shuffle ( bool ) – Whether to shuffle the data when splitting
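
A hedged usage sketch, assuming split returns the two resulting subsets; builder, the path lists and the 0.8 ratio are hypothetical:

>>> from renom_img.api.utility.distributor.distributor import ImageDistributor
>>> dist = ImageDistributor(img_path_list, label_list, target_builder=builder)
>>> train_dist, valid_dist = dist.split(0.8)
>>> for x, y in train_dist.batch(batch_size=16):
...     pass  # consume one batch per iteration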

renom_img.api.utility.misc

draw_box ( img , prediction , font_path=None , color_list=None )

Function for drawing bounding boxes, class names and scores on an input image.

Parameters:
  • img ( string , ndarray ) – A path to an image, or an image array.
  • prediction ( list ) – List of annotations. Each annotation is a dictionary with the keys box , name and score . The format is below.
[
    {'box': [x(float), y, w, h], 'name': class name(string), 'score': score(float)},
    {'box': [x(float), y, w, h], 'name': class name(string), 'score': score(float)},
    ...
]

  • font_path ( string ) – Path to a font file used for drawing object names. If None is given, a default font will be used.
  • color_list ( list ) – A list of colors for rendering bounding boxes. If None is given, a default color list will be used.
Returns: The image with the bounding boxes drawn on it.
Return type: (PIL.Image)

Example

>>> from PIL import Image
>>> from renom_img.api.utility.load import *
>>> prediction = parse_xml_detection(prediction_xml_path_list)[0]
>>> bbox_image = draw_box(img_path, prediction)

Note

The box values are relative coordinates, so they lie in the range [0.0, 1.0]. If you pass the argument img as an ndarray, it must have the shape (channel, height, width). For example, an RGB image of width 100 and height 10 corresponds to an array of shape (3, 10, 100).

draw_segment ( img , prediction , color_list=None , show_background=True )

Function for drawing segmentation results according to the argument prediction .

Parameters:
  • img ( string , ndarray ) – A path to an image, or an image array.
  • prediction ( ndarray ) – Predicted annotations. This must be a matrix whose size equals the image size.
  • color_list ( list ) – A list of colors for rendering the segments. If None is given, a default color list will be used.
  • show_background ( bool ) – If False, the background class (id 0) will not be drawn.
Returns:

The image with the prediction result drawn on it.

Return type:

(PIL.Image)

Example

>>> from PIL import Image
>>> prediction = Image.open(predicted_result)
>>> image = Image.open(img_path)
>>> segment_image = draw_segment(image, prediction)

Note

If you pass the argument img as an ndarray, it must have the shape (channel, height, width). Likewise, the argument prediction must be a matrix in (channel, height, width) format. For example, an RGB image of width 100 and height 10 corresponds to an array of shape (3, 10, 100).

pil2array ( img )

Function for converting a PIL image to a numpy array.

Example

>>> from renom_img.api.utility.misc.display import pil2array
>>> from PIL import Image
>>> img = Image.open(img_path)
>>> converted_img = pil2array(img)
Parameters: img ( PIL.Image ) – PIL Image
Returns: The converted numpy array.
Return type: (numpy.ndarray)