renom_img.api.detection

class Yolov1 ( class_map=[] , cells=7 , bbox=2 , imsize=(224 , 224) , load_pretrained_weight=False , train_whole_network=False )

Yolo object detection algorithm.

Parameters:
  • class_map ( list ) – List of class names.
  • cells ( int or tuple ) – Number of grid cells along each image side.
  • bbox ( int ) – Number of bounding boxes predicted per cell.
  • imsize ( int , tuple ) – Image size.
  • load_pretrained_weight ( bool , str ) – If True, the pretrained weights are downloaded to the current directory. If a string is given, the pretrained weights are saved under that name.
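
The target array in the regularize example further down this page has shape (1, (5*2+20)*7*7), which suggests how cells, bbox and the number of classes determine the size of the flattened output for one image. The sketch below reproduces that arithmetic. It assumes each box contributes five values and each cell one score per class; yolov1_output_size is a hypothetical helper, not part of renom_img.

```python
def yolov1_output_size(num_class, cells=7, bbox=2):
    """Length of the flattened Yolov1 output for one image (assumed layout)."""
    # cells may be a single int (square grid) or a (rows, cols) tuple.
    if isinstance(cells, int):
        rows, cols = cells, cells
    else:
        rows, cols = cells
    # Assumption: every box carries 5 values (x, y, w, h, confidence)
    # and every cell carries one score per class.
    return (5 * bbox + num_class) * rows * cols


# 20 classes, 7x7 grid, 2 boxes per cell: (5*2 + 20) * 7 * 7
print(yolov1_output_size(20))  # 1470
```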

References

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
You Only Look Once: Unified, Real-Time Object Detection

build_data ( )

Returns a function that creates input data and target data in the format Yolov1 expects.

Returns: A function that creates input data and target data.
Return type: (function)

Example

>>> builder = model.build_data()  # This will return function.
>>> x, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, y)
fit ( train_img_path_list , train_annotation_list , valid_img_path_list=None , valid_annotation_list=None , epoch=136 , batch_size=64 , augmentation=None , callback_end_epoch=None )

Trains the model with the given data and hyperparameters.

Parameters:
  • train_img_path_list ( list ) – List of training image paths.
  • train_annotation_list ( list ) – List of training annotations.
  • valid_img_path_list ( list ) – List of validation image paths.
  • valid_annotation_list ( list ) – List of validation annotations.
  • epoch ( int ) – Number of training epochs.
  • batch_size ( int ) – Batch size.
  • augmentation ( Augmentation ) – Augmentation object.
  • callback_end_epoch ( function ) – Function called at the end of each epoch.
Returns:

Training loss list and validation loss list.

Return type:

(tuple)

Example

>>> from renom_img.api.detection.yolo_v1 import Yolov1
>>> train_img_path_list, train_annot_list = ... # Define own data.
>>> valid_img_path_list, valid_annot_list = ...
>>> model = Yolov1()
>>> model.fit(
...     # Feeds image and annotation data.
...     train_img_path_list,
...     train_annot_list,
...     valid_img_path_list,
...     valid_annot_list,
...     epoch=8,
...     batch_size=8)
>>>

The following arguments are passed to the callback_end_epoch function.

  • epoch (int) - Current epoch number.
  • model (Model) - Yolov1 object.
  • avg_train_loss_list (list) - List of the average training loss of each epoch.
  • avg_valid_loss_list (list) - List of the average validation loss of each epoch.
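
A callback with this signature could look like the sketch below. It assumes the four arguments arrive in the order listed above; the name on_epoch_end and the history list are illustrative only. The final call uses dummy values so the snippet runs without training a model.

```python
history = []


def on_epoch_end(epoch, model, avg_train_loss_list, avg_valid_loss_list):
    """Record and report the most recent average losses."""
    train_loss = avg_train_loss_list[-1]
    # Assumption: the validation list may be empty when no validation
    # data was supplied, so guard against indexing an empty list.
    valid_loss = avg_valid_loss_list[-1] if avg_valid_loss_list else None
    history.append((epoch, train_loss, valid_loss))
    print("epoch", epoch, "train", train_loss, "valid", valid_loss)


# Stand-in call with dummy values. During training the function would be
# handed to fit, e.g. model.fit(..., callback_end_epoch=on_epoch_end).
on_epoch_end(0, None, [1.25], [1.5])
```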
forward ( x )

Performs forward propagation. This method can also be invoked through __call__, as shown in the example below.

Parameters: x ( ndarray , Node ) – Input image as a tensor.
Returns: Raw output of Yolov1. It can be converted to bounding boxes with the get_bbox method.
Return type: (Node)

Example

>>> import numpy as np
>>> from renom_img.api.detection.yolo_v1 import Yolov1
>>>
>>> x = np.random.rand(1, 3, 224, 224)
>>> class_map = ["dog", "cat"]
>>> model = Yolov1(class_map)
>>> y = model.forward(x) # Forward propagation.
>>> y = model(x)  # Same as above result.
>>>
>>> bbox = model.get_bbox(y) # The output can be reformed using get_bbox method.
get_bbox ( z , score_threshold=0.3 , nms_threshold=0.4 )
Parameters:
  • z ( ndarray ) – Output array of neural network. The shape of array
  • score_threshold ( float ) – Confidence score threshold. Predicted boxes with a confidence score below the threshold are discarded. Defaults to 0.3.
  • nms_threshold ( float ) – Threshold for non-maximum suppression. Defaults to 0.4.
Returns:

List of the predicted bounding boxes, scores, and classes for each image. The format of the return value is shown below. Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].

Return type:

(list)

# An example of return value.
[
    [ # Prediction of first image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    [ # Prediction of second image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    ...
]

Example

>>> z = model(x)
>>> model.get_bbox(z)
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
 [{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]

Note

Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].
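
To illustrate what score_threshold and nms_threshold control, here is a minimal sketch of score filtering followed by greedy non-maximum suppression on dictionaries in the return format above. It is not renom_img's implementation. The helper names, the class-agnostic suppression, and the reading of (x, y) as the box centre are all assumptions.

```python
def iou(a, b):
    """Intersection over union of two [x, y, w, h] boxes.

    Assumption: (x, y) is the box centre and (w, h) its full size.
    """
    ax1, ay1 = a[0] - a[2] / 2, a[1] - a[3] / 2
    ax2, ay2 = a[0] + a[2] / 2, a[1] + a[3] / 2
    bx1, by1 = b[0] - b[2] / 2, b[1] - b[3] / 2
    bx2, by2 = b[0] + b[2] / 2, b[1] + b[3] / 2
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(preds, score_threshold=0.3, nms_threshold=0.4):
    """Drop low-score boxes, then greedily suppress overlapping ones."""
    candidates = sorted(
        (p for p in preds if p['score'] >= score_threshold),
        key=lambda p: p['score'],
        reverse=True,
    )
    kept = []
    for p in candidates:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(p['box'], k['box']) <= nms_threshold for k in kept):
            kept.append(p)
    return kept


preds = [
    {'box': [0.50, 0.50, 0.20, 0.20], 'score': 0.9, 'class': 1, 'name': 'dog'},
    # Heavily overlaps the first box, so it is suppressed.
    {'box': [0.52, 0.50, 0.20, 0.20], 'score': 0.6, 'class': 1, 'name': 'dog'},
    # Below the score threshold, so it is discarded.
    {'box': [0.10, 0.10, 0.10, 0.10], 'score': 0.2, 'class': 0, 'name': 'cat'},
]
print([p['score'] for p in filter_boxes(preds)])  # [0.9]
```

Raising nms_threshold makes suppression more permissive: with nms_threshold=0.9 the second box in this example survives as well.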

get_optimizer ( current_epoch=None , total_epoch=None , current_batch=None , total_batch=None )

Returns an Optimizer instance for training Yolov1.

If all arguments (current_epoch, total_epoch, current_batch, total_batch) are given, the learning rate of the returned optimizer is adjusted according to the number of training iterations. Otherwise, a constant learning rate is set.

Parameters:
  • current_epoch ( int ) – Current epoch number.
  • total_epoch ( int ) – Total number of epochs.
  • current_batch ( int ) – Current batch number.
  • total_batch ( int ) – Total number of batches.
Returns:

Optimizer object.

Return type:

(Optimizer)
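
The four arguments together identify how far training has progressed, which is what an iteration-dependent learning rate needs. The schedule that renom_img applies is not described here. The sketch below only shows one way to turn the counters into a global iteration index and a progress fraction, assuming zero-based current_epoch and current_batch; training_progress is a hypothetical helper.

```python
def training_progress(current_epoch, total_epoch, current_batch, total_batch):
    """Return (global iteration index, fraction of training completed)."""
    # Assumption: current_epoch and current_batch both start at 0.
    iteration = current_epoch * total_batch + current_batch
    total_iterations = total_epoch * total_batch
    return iteration, iteration / total_iterations


# Third epoch of 8, sixth batch of 10: iteration 25 of 80.
print(training_progress(current_epoch=2, total_epoch=8,
                        current_batch=5, total_batch=10))  # (25, 0.3125)
```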

loss ( x , y )

Loss function for Yolov1.

Parameters:
  • x ( Node , ndarray ) – Output data of neural network.
  • y ( Node , ndarray ) – Target data.
Returns:

Loss between x and y.

Return type:

(Node)

Example

>>> z = model(x)
>>> model.loss(z, y)
predict ( img_list , score_threshold=0.3 , nms_threshold=0.4 )

This method accepts either an ndarray or a list of image paths.

Parameters:
  • img_list ( string , list , ndarray ) – Path to an image, a list of paths, or an ndarray.
  • score_threshold ( float ) – Confidence score threshold. Predicted boxes with a confidence score below the threshold are discarded. Defaults to 0.3.
  • nms_threshold ( float ) – Threshold for non-maximum suppression. Defaults to 0.4.
Returns:

List of the predicted bounding boxes, scores, and classes for each image. The format of the return value is shown below. Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].

Return type:

(list)

# An example of return value.
[
    [ # Prediction of first image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    [ # Prediction of second image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    ...
]

Example

>>>
>>> model.predict(['img01.jpg', 'img02.jpg'])
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
 [{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]

Note

Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].
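
Because the boxes are ratios, they have to be scaled by the original image width and height before they can be drawn in pixel space. A minimal sketch, with to_pixels as a hypothetical helper; it only rescales the four numbers and makes no assumption about whether (x, y) is a corner or the centre of the box.

```python
def to_pixels(box, img_width, img_height):
    """Scale a ratio box [x, y, w, h] to pixel units of the original image."""
    x, y, w, h = box
    # Horizontal values scale with the width, vertical ones with the height.
    return [x * img_width, y * img_height, w * img_width, h * img_height]


prediction = {'box': [0.5, 0.25, 0.25, 0.5], 'score': 0.823, 'class': 1, 'name': 'dog'}
print(to_pixels(prediction['box'], img_width=640, img_height=480))
# [320.0, 120.0, 160.0, 240.0]
```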

preprocess ( x )

Image preprocessing for Yolov1.

\(x_{new} = x*2/255 - 1\)

Parameters: x ( ndarray ) –
Returns: Preprocessed data.
Return type: (ndarray)
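
The formula above maps pixel values from [0, 255] to [-1, 1]. A minimal sketch of the arithmetic, with scale_pixels as a hypothetical stand-in for the library method. It is shown on plain numbers, and the same expression applies elementwise to an ndarray.

```python
def scale_pixels(x):
    """Map pixel values from [0, 255] to [-1, 1]: x_new = x * 2 / 255 - 1."""
    return x * 2 / 255 - 1


print(scale_pixels(0), scale_pixels(255))  # -1.0 1.0
```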
regularize ( )

Regularization term. You can use this function to add a regularization term to the loss function.

In Yolov1, a weight decay of 0.0005 is applied.
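
As an illustration of what a weight decay term contributes, the sketch below computes an L2 penalty with the coefficient 0.0005 and adds it to a data loss. The exact form used by renom_img (for example, whether a factor of 1/2 is applied or which layers are included) is not stated here, so the plain sum of squared weights is an assumption, and weight_decay_term is a hypothetical helper.

```python
def weight_decay_term(weights, decay=0.0005):
    """L2 penalty: decay * sum of squared weights (assumed form)."""
    return decay * sum(w * w for w in weights)


data_loss = 1.2
weights = [1.0, -2.0, 3.0]  # stand-in for the model's weight values
total_loss = data_loss + weight_decay_term(weights)
print(total_loss)
```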

Example

>>> import numpy as np
>>> from renom_img.api.detection.yolo_v1 import Yolov1
>>> x = np.random.rand(1, 3, 224, 224)
>>> y = np.random.rand(1, (5*2+20)*7*7)
>>> model = Yolov1()
>>> z = model(x)  # Forward propagation; the loss takes the network output.
>>> loss = model.loss(z, y)
>>> reg_loss = loss + model.regularize() # Adding weight decay term.
class Yolov2 ( class_map=[] , anchor=None , imsize=(320 , 320) , load_pretrained_weight=False , train_whole_network=False )

Yolov2 object detection algorithm.

Parameters:
  • class_map ( list ) – List of class names.
  • anchor ( AnchorYolov2 ) – Anchors.
  • imsize ( list ) – Image size. This can be either a single image size, e.g. (320, 320), or a list of image sizes, e.g. [(288, 288), (320, 320)]. If a list of image sizes is given, the prediction method uses the last image size in the list.
  • load_pretrained_weight ( bool , string ) –
  • train_whole_network ( bool ) –
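
The rule for imsize can be sketched as follows: a single size is used as-is, while a list of sizes resolves to its last entry at prediction time. prediction_imsize is a hypothetical helper written for illustration, not a renom_img function.

```python
def prediction_imsize(imsize):
    """Return the image size used at prediction time."""
    # A single size is a pair of ints; a list of sizes holds such pairs.
    if isinstance(imsize[0], int):
        return tuple(imsize)
    return tuple(imsize[-1])


print(prediction_imsize((320, 320)))                            # (320, 320)
print(prediction_imsize([(288, 288), (320, 320), (352, 352)]))  # (352, 352)
```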

References

Joseph Redmon, Ali Farhadi
YOLO9000: Better, Faster, Stronger

Note

If you save this model with the ‘save’ method, the anchor information (the anchor list and its base size) is saved as well. Therefore, when you load your own saved model, you do not have to give the arguments ‘anchor’ and ‘anchor_size’.

build_data ( imsize_list=None )

Returns a function that creates input data and target data in the format Yolov2 expects.

Returns: A function that creates input data and target data.
Return type: (function)

Example

>>> builder = model.build_data()  # This will return function.
>>> x, y = builder(image_path_list, annotation_list)
>>> z = model(x)
>>> loss = model.loss(z, y)
fit ( train_img_path_list , train_annotation_list , valid_img_path_list=None , valid_annotation_list=None , epoch=160 , batch_size=16 , imsize_list=None , augmentation=None , callback_end_epoch=None )

Trains the model with the given data and hyperparameters. Yolov2 is trained on images at multiple scales, so this function accepts a list of image sizes. If the list is not given, the model is trained with a fixed image size.
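
Multi-scale training amounts to drawing an image size from the list while training proceeds. How often renom_img switches sizes is not stated here; the sketch below assumes one random draw per batch and falls back to a fixed size when no list is given. pick_imsize and its default are illustrative only.

```python
import random


def pick_imsize(imsize_list, default=(320, 320), rng=random):
    """Choose the image size for the next batch."""
    if not imsize_list:
        return default  # No list given: train with a fixed image size.
    return rng.choice(imsize_list)


sizes = [(288, 288), (320, 320), (352, 352)]
rng = random.Random(0)  # Seeded generator so repeated runs draw the same sizes.
for batch in range(3):
    print(batch, pick_imsize(sizes, rng=rng))
```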

Parameters:
  • train_img_path_list ( list ) – List of training image paths.
  • train_annotation_list ( list ) – List of training annotations.
  • valid_img_path_list ( list ) – List of validation image paths.
  • valid_annotation_list ( list ) – List of validation annotations.
  • epoch ( int ) – Number of training epochs.
  • batch_size ( int ) – Batch size.
  • imsize_list ( list ) – List of image sizes.
  • augmentation ( Augmentation ) – Augmentation object.
  • callback_end_epoch ( function ) – Function called at the end of each epoch.
Returns:

Training loss list and validation loss list.

Return type:

(tuple)

Example

>>> from renom_img.api.detection.yolo_v2 import Yolov2
>>> train_img_path_list, train_annot_list = ... # Define own data.
>>> valid_img_path_list, valid_annot_list = ...
>>> model = Yolov2()
>>> model.fit(
...     # Feeds image and annotation data.
...     train_img_path_list,
...     train_annot_list,
...     valid_img_path_list,
...     valid_annot_list,
...     epoch=8,
...     batch_size=8)
>>>

The following arguments are passed to the callback_end_epoch function.

  • epoch (int) - Current epoch number.
  • model (Model) - Yolov2 object.
  • avg_train_loss_list (list) - List of the average training loss of each epoch.
  • avg_valid_loss_list (list) - List of the average validation loss of each epoch.
forward ( x )

Performs forward propagation. This method can also be invoked through __call__, as shown in the example below.

Parameters: x ( ndarray , Node ) – Input image as a tensor.
Returns: Raw output of Yolov2. It can be converted to bounding boxes with the get_bbox method.
Return type: (Node)

Example

>>> import numpy as np
>>> from renom_img.api.detection.yolo_v2 import Yolov2
>>>
>>> x = np.random.rand(1, 3, 224, 224)
>>> class_map = ["dog", "cat"]
>>> model = Yolov2(class_map)
>>> y = model.forward(x) # Forward propagation.
>>> y = model(x)  # Same as above result.
>>>
>>> bbox = model.get_bbox(y) # The output can be reformed using get_bbox method.
get_bbox ( z , score_threshold=0.3 , nms_threshold=0.4 )

Example

>>> z = model(x)
>>> model.get_bbox(z)
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
 [{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]
Parameters: z ( ndarray ) – Output array of neural network. The shape of array
Returns: List of the predicted bounding boxes, scores, and classes for each image. The format of the return value is shown below. Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].
Return type: (list)
# An example of return value.
[
    [ # Prediction of first image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    [ # Prediction of second image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    ...
]

Note

Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].
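
Since the return value is a list of per-image lists of dictionaries, it can be summarised with ordinary Python. The sketch below counts detections per class name across all images. count_by_name is a hypothetical helper, and the sample data is made up to match the format above.

```python
from collections import Counter


def count_by_name(results):
    """Count detections per class name across all images."""
    return Counter(det['name'] for image_dets in results for det in image_dets)


# Illustrative data in the documented return format.
results = [
    [{'box': [0.21, 0.44, 0.11, 0.32], 'score': 0.823, 'class': 1, 'name': 'dog'}],
    [{'box': [0.87, 0.38, 0.84, 0.22], 'score': 0.423, 'class': 0, 'name': 'cat'},
     {'box': [0.30, 0.30, 0.10, 0.10], 'score': 0.510, 'class': 1, 'name': 'dog'}],
]
counts = count_by_name(results)
print(counts['dog'], counts['cat'])  # 2 1
```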

get_optimizer ( current_epoch=None , total_epoch=None , current_batch=None , total_batch=None )

Returns an Optimizer instance for training Yolov2.

If all arguments (current_epoch, total_epoch, current_batch, total_batch) are given, the learning rate of the returned optimizer is adjusted according to the number of training iterations. Otherwise, a constant learning rate is set.

Parameters:
  • current_epoch ( int ) – Current epoch number.
  • total_epoch ( int ) – Total number of epochs.
  • current_batch ( int ) – Current batch number.
  • total_batch ( int ) – Total number of batches.
Returns:

Optimizer object.

Return type:

(Optimizer)

loss ( x , y )

Loss function for Yolov2.

Parameters:
  • x ( Node , ndarray ) – Output data of neural network.
  • y ( Node , ndarray ) – Target data.
Returns:

Loss between x and y.

Return type:

(Node)

Example

>>> z = model(x)
>>> model.loss(z, y)
predict ( img_list , score_threshold=0.3 , nms_threshold=0.4 )

This method accepts either an ndarray or a list of image paths.

Example

>>>
>>> model.predict(['img01.jpg', 'img02.jpg'])
[[{'box': [0.21, 0.44, 0.11, 0.32], 'score':0.823, 'class':1, 'name':'dog'}],
 [{'box': [0.87, 0.38, 0.84, 0.22], 'score':0.423, 'class':0, 'name':'cat'}]]
Parameters:
  • img_list ( string , list , ndarray ) – Path to an image, a list of paths, or an ndarray.
  • score_threshold ( float ) – Confidence score threshold. Predicted boxes with a confidence score below the threshold are discarded. Defaults to 0.3.
  • nms_threshold ( float ) – Threshold for non-maximum suppression. Defaults to 0.4.
Returns:

List of the predicted bounding boxes, scores, and classes for each image. The format of the return value is shown below. Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].

Return type:

(list)

# An example of return value.
[
    [ # Prediction of first image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    [ # Prediction of second image.
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        {'box': [x, y, w, h], 'score':(float), 'class':(int), 'name':(str)},
        ...
    ],
    ...
]

Note

Box coordinates and sizes are returned as ratios of the original image size, so every value in ‘box’ lies in the range [0, 1].
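
A common follow-up step is to keep only the most confident detection of each image. The sketch below does this on data in the documented return format and returns None for an image with no detections. best_detection is a hypothetical helper and the sample data is made up.

```python
def best_detection(image_dets):
    """Return the highest-scoring detection of one image, or None if empty."""
    return max(image_dets, key=lambda det: det['score'], default=None)


# Illustrative data: two detections in the first image, none in the second.
results = [
    [{'box': [0.21, 0.44, 0.11, 0.32], 'score': 0.823, 'class': 1, 'name': 'dog'},
     {'box': [0.60, 0.60, 0.20, 0.20], 'score': 0.410, 'class': 0, 'name': 'cat'}],
    [],
]
best = [best_detection(dets) for dets in results]
print([det['name'] if det else None for det in best])  # ['dog', None]
```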

preprocess ( x )

Image preprocessing for Yolov2.

\(x_{new} = x*2/255 - 1\)

Parameters: x ( ndarray ) –
Returns: Preprocessed data.
Return type: (ndarray)
regularize ( )

Regularization term. You can use this function to add a regularization term to the loss function.

In Yolov2, a weight decay of 0.0005 is applied.

Example

>>> import numpy as np
>>> from renom_img.api.detection.yolo_v2 import Yolov2
>>> x = np.random.rand(1, 3, 224, 224)
>>> y = np.random.rand(1, (5*2+20)*7*7)
>>> model = Yolov2()
>>> z = model(x)  # Forward propagation; the loss takes the network output.
>>> loss = model.loss(z, y)
>>> reg_loss = loss + model.regularize() # Adding weight decay term.