lavfi/dnn_classify: add filter dnn_classify for classification based on detection bounding boxes

Classification is done on every detection bounding box in the frame's side data,
which holds the results of object detection (filter dnn_detect).

Please refer to the commit log of dnn_detect for the material for detection,
and see below for classification.

- download material for classification:
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.bin
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.xml
wget https://github.com/guoyejun/ffmpeg_dnn/raw/main/models/openvino/2021.1/emotions-recognition-retail-0003.label

- run command as:
./ffmpeg -i cici.jpg -vf dnn_detect=dnn_backend=openvino:model=face-detection-adas-0001.xml:input=data:output=detection_out:confidence=0.6:labels=face-detection-adas-0001.label,dnn_classify=dnn_backend=openvino:model=emotions-recognition-retail-0003.xml:input=data:output=prob_emotion:confidence=0.3:labels=emotions-recognition-retail-0003.label:target=face,showinfo -f null -

We'll see the detect&classify result as below:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] side data - detection bounding boxes:
[Parsed_showinfo_2 @ 0x55b7d25e77c0] source: face-detection-adas-0001.xml, emotions-recognition-retail-0003.xml
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.
[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.

Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
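
The showinfo lines quoted in the commit message report each confidence as a fraction such as 6757/10000, with every "classify" line following the bounding box it belongs to. As a rough illustration of how that log could be post-processed, here is a minimal Python sketch. The function name `parse_showinfo` and the regular expressions are assumptions modeled only on the sample lines above; they are not part of FFmpeg and may need adjusting if the real log spacing differs.

```python
import re

# Patterns modeled on the sample showinfo lines in the commit message
# (hypothetical helper, not an FFmpeg API).
BOX_RE = re.compile(
    r"index:\s*(\d+),\s*region:\s*\((\d+),\s*(\d+)\)\s*->\s*\((\d+),\s*(\d+)\),"
    r"\s*label:\s*([^,]+),\s*confidence:\s*(\d+)/(\d+)"
)
CLASSIFY_RE = re.compile(r"classify:\s*label:\s*([^,]+),\s*confidence:\s*(\d+)/(\d+)")


def parse_showinfo(lines):
    """Group each 'classify' line under the bounding box that precedes it."""
    boxes = []
    for line in lines:
        m = BOX_RE.search(line)
        if m:
            boxes.append({
                "index": int(m.group(1)),
                "region": tuple(int(m.group(i)) for i in (2, 3, 4, 5)),
                "label": m.group(6).strip(),
                "confidence": int(m.group(7)) / int(m.group(8)),
                "classes": [],
            })
            continue
        m = CLASSIFY_RE.search(line)
        if m and boxes:
            # Attach the classification to the most recent bounding box.
            boxes[-1]["classes"].append(
                (m.group(1).strip(), int(m.group(2)) / int(m.group(3)))
            )
    return boxes


sample = [
    "[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 0, region: (1005, 813) -> (1086, 905), label: face, confidence: 10000/10000.",
    "[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: happy, confidence: 6757/10000.",
    "[Parsed_showinfo_2 @ 0x55b7d25e77c0] index: 1, region: (888, 839) -> (967, 926), label: face, confidence: 6917/10000.",
    "[Parsed_showinfo_2 @ 0x55b7d25e77c0] classify: label: anger, confidence: 4320/10000.",
]
result = parse_showinfo(sample)
```

With the sample above, `result` holds two face boxes, each carrying one classification entry with its confidence converted to a float between 0 and 1.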
author: Guo, Yejun <yejun.guo@intel.com> 2021-03-17 14:08:38 +0800
committer: Guo, Yejun <yejun.guo@intel.com> 2021-05-06 10:50:44 +0800
commit: 41ef57fdb27c9583e61af8eea1ba710314cd86e5 (patch)
tree: 259ac105389a3e40a548fc3f97f756cc1680fcd8 /doc
parent: fc26dca64e0e5d20bb0fcc8743d073cf5b107264 (diff)
1 files changed, 39 insertions, 0 deletions
diff --git a/doc/filters.texi b/doc/filters.texi
index 36e35a175b..b405cc5dfb 100644
--- a/doc/filters.texi
+++ b/doc/filters.texi
@@ -10127,6 +10127,45 @@ ffmpeg -i INPUT -f lavfi -i nullsrc=hd720,geq='r=128+80*(sin(sqrt((X-W/2)*(X-W/2
 @end example
 @end itemize
 
+@section dnn_classify
+
+Do classification with deep neural networks based on bounding boxes.
+
+The filter accepts the following options:
+
+@table @option
+@item dnn_backend
+Specify which DNN backend to use for model loading and execution. This option accepts
+only openvino for now; tensorflow backend support will be added.
+
+@item model
+Set path to model file specifying network architecture and its parameters.
+Note that different backends use different file formats.
+
+@item input
+Set the input name of the dnn network.
+
+@item output
+Set the output name of the dnn network.
+
+@item confidence
+Set the confidence threshold (default: 0.5).
+
+@item labels
+Set path to label file specifying the mapping between label id and name.
+Each label name is written on one line; trailing spaces and empty lines are skipped.
+The first line is the name of label id 0,
+and the second line is the name of label id 1, etc.
+The label id is used as the name if no label file is provided.
+
+@item backend_configs
+Set the configs to be passed into the backend.
+
+For the tensorflow backend, you can set its configs with the @option{sess_config} option.
+Please use tools/python/tf_sess_config.py to get the configs for your system.
+
+@end table
+
 @section dnn_detect
 
 Do object detection with deep neural networks.
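
The label file rules documented for the labels option in the patch above (one name per line, trailing spaces and empty lines skipped, line order giving the label id, id used as the name when no file is given) can be illustrated with a short Python sketch. This is only one reading of the documentation text, not the filter's actual C implementation. The helper names `parse_labels` and `label_name`, the sample label names, and the out-of-range fallback are all assumptions.

```python
def parse_labels(text):
    """Return label names in id order from the contents of a label file.

    Assumes skipped empty lines do not consume a label id.
    """
    labels = []
    for raw in text.splitlines():
        name = raw.rstrip()  # drop trailing spaces
        if name:             # skip empty lines
            labels.append(name)
    return labels


def label_name(label_id, labels=None):
    """Map a label id to its name, falling back to the id itself.

    The fallback for an out-of-range id is an assumption of this sketch;
    the documentation only covers the case where no label file is given.
    """
    if labels is None or not 0 <= label_id < len(labels):
        return str(label_id)
    return labels[label_id]


# Illustrative file contents, not the real emotions-recognition label file.
labels = parse_labels("neutral\nhappy  \n\nsad\n")
```

Here `label_name(1, labels)` gives the second name in the file, while `label_name(1)` with no label list returns the string form of the id.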