| Commit message (Collapse) | Author | Age |
... | |
|
|
|
|
|
|
|
| |
the default number of batch_size is 1
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Reviewed-by: Lynne <dev@lynne.ee>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
|
|
|
|
| |
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
| |
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
The prefix for symbols not exported from the library and not
local to one translation unit is ff_ (or FF for types).
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
'void *' is too flexible, since we can derive info from
AVFilterContext*, so we just unify the interface with this data
structure.
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
| |
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
| |
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
| |
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
| |
function fill_model_input_ov and infer_completion_callback are
extracted, it will help the async execution for reuse.
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
| |
Signed-off-by: Xie, Lin <lin.xie@intel.com>
Signed-off-by: Wu Zhiwen <zhiwen.wu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
sws_getContext may be return NULL, and it's will be dereferenced,
so add the check.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
|
|
|
|
|
|
| |
Used the format name in debug message.
Signed-off-by: Jun Zhao <barryjzhao@tencent.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TensorFlow C library accepts config for session options to
set different parameters for the inference. This patch exports
this interface.
The config is a serialized tensorflow.ConfigProto proto, so we need
two steps to use it:
1. generate the serialized proto with python (see script example below)
the output looks like: 0xab...cd
where 0xcd is the least significant byte and 0xab is the most significant byte.
2. pass the python script output into ffmpeg with
dnn_processing=options=sess_config=0xab...cd
The following script is an example to specify one GPU. If the system contains
3 GPU cards, the visible_device_list could be '0', '1', '2', '0,1' etc.
'0' does not mean physical GPU card 0, we need to try and see.
And we can also add more opitions here to generate more serialized proto.
script example to generate serialized proto which specifies one GPU:
import tensorflow as tf
gpu_options = tf.GPUOptions(visible_device_list='0')
config = tf.ConfigProto(gpu_options=gpu_options)
s = config.SerializeToString()
b = ''.join("%02x" % int(ord(b)) for b in s[::-1])
print('0x%s' % b)
|
|
|
|
|
| |
These previously would not check that the return value was non-null
meaning it was susceptible to a sigsegv. This checks those values.
|
|
|
|
| |
check that frame allocations return non-null.
|
|
|
|
| |
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
for some cases (for example, super resolution), the DNN model changes
the frame size which impacts the filter behavior, so the filter needs
to know the out frame size at very beginning.
Currently, the filter reuses DNNModule.execute_model to query the
out frame size, it is not clear from interface perspective, so add
a new explict interface DNNModel.get_output for such query.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
suppose we have a detect and classify filter in the future, the
detect filter generates some bounding boxes (BBox) as AVFrame sidedata,
and the classify filter executes DNN model for each BBox. For each
BBox, we need to crop the AVFrame, copy data to DNN model input and do
the model execution. So we have to save the in_frame at DNNModel.set_input
and use it at DNNModule.execute_model, such saving is not feasible
when we support async execute_model.
This patch sets the in_frame as execution_model parameter, and so
all the information are put together within the same function for
each inference. It also makes easy to support BBox async inference.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, every filter needs to provide code to transfer data from
AVFrame* to model input (DNNData*), and also from model output
(DNNData*) to AVFrame*. Actually, such transfer can be implemented
within DNN module, and so filter can focus on its own business logic.
DNN module also exports the function pointer pre_proc and post_proc
in struct DNNModel, just in case that a filter has its special logic
to transfer data between AVFrame* and DNNData*. The default implementation
within DNN module is used if the filter does not set pre/post_proc.
|
|
|
|
| |
the userdata will be used for the interaction between AVFrame and DNNData
|
|
|
|
|
|
|
|
|
|
|
| |
mode.
Before patch, fate test for dnn may fail in some Windows environment
while succeed in my Linux. The bug was caused by a wrong loop boundary.
After patch, fate test succeed in my windows mingw 64-bit.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
Move thread area allocate out of thread function into
main thread.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
function.
Before patch, memory was allocated in each thread functions,
which may cause more than one time of memory allocation and
cause crash.
After patch, memory is allocated in the main thread once,
an index was parsed into thread functions. Bug fixed.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
|
|
|
|
|
|
|
| |
show all input/output names when the input or output name not correct
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
for enabling OpenVINO GPU please:
1. install required OpenCL drivers, see: https://github.com/intel/compute-runtime/releases/tag/19.41.14441
2. build OpenVINO c lib with GPU enabled: use cmake config with: -DENABLE_CLDNN=ON
3. then make, and include the OpenVINO c lib in environment variables
detailed steps please refer: https://github.com/openvinotoolkit/openvino/blob/master/build-instruction.md
inference model with GPU please add: optioins=device=GPU
Signed-off-by: Ting Fu <ting.fu@intel.com>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
Found via ASAN with the dnn-layer-conv2d FATE-test.
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use pthread to multithread dnn_execute_layer_conv2d.
Can be tested with command "./ffmpeg_g -i input.png -vf \
format=yuvj420p,dnn_processing=dnn_backend=native:model= \
espcn.model:input=x:output=y:options=conv2d_threads=23 \
-y sr_native.jpg -benchmark"
before patch: utime=11.238s stime=0.005s rtime=11.248s
after patch: utime=20.817s stime=0.047s rtime=1.051s
on my 3900X 12c24t @4.2GHz
About the increase of utime, it's because that CPU HyperThreading
technology makes logical cores twice of physical cores while cpu's
counting performance improves less than double. And utime sums
all cpu's logical cores' runtime. As a result, using threads num
near cpu's logical core's number will double utime, while reduce
rtime less than half for HyperThreading CPUs.
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Signed-off-by: Xu Jun <xujunzz@sjtu.edu.cn>
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
|
|
|
| |
Unify all error return as DNN_ERROR, in order to cease model executing
when return error in ff_dnn_execute_model_native layer_func.pf_exec
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
|
|
|
|
|
|
|
| |
currently, output is set both at DNNModel.set_input_output and
DNNModule.execute_model, it makes sense that the output name is
provided at model inference time so all the output info is set
at a single place.
and so DNNModel.set_input_output is renamed to DNNModel.set_input
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
| |
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
|
|
|
|
| |
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dnn_execute_layer_avg_pool() contains the following line:
assert(avgpool_params->padding_method = VALID);
This statement contains an assignment where obviously a comparison was
intended. Furthermore, *avgpool_params is const, so that the attempted
assignment leads to a compilation failure if asserts are enabled
(i.e. if DEBUG is defined which leads libavutil/internal.h to not define
NDEBUG). Moreover, the enumeration constant VALID actually has the value 0,
so that the assert would be triggered if a compiler compiles this with
asserts enabled. Finally, the statement uses assert() directly instead
of av_assert*().
All these errors have been fixed.
Thanks to ubitux for providing a FATE-box [1] where DEBUG is defined.
[1]: http://fate.ffmpeg.org/history.cgi?slot=x86_64-archlinux-gcc-ddebug
Signed-off-by: Andreas Rheinhardt <andreas.rheinhardt@gmail.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
| |
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
different backend might need different options for a better performance,
so, add the parameter into dnn interface, as a preparation.
Signed-off-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
| |
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
| |
Not support pooling strides in channel dimension yet.
Signed-off-by: Ting Fu <ting.fu@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It can be tested with the model generated with below python script:
import tensorflow as tf
import os
import numpy as np
import imageio
from tensorflow.python.framework import graph_util
name = 'floor'
pb_file_path = os.getcwd()
if not os.path.exists(pb_file_path+'/{}_savemodel/'.format(name)):
os.mkdir(pb_file_path+'/{}_savemodel/'.format(name))
with tf.Session(graph=tf.Graph()) as sess:
in_img = imageio.imread('detection.jpg')
in_img = in_img.astype(np.float32)
in_data = in_img[np.newaxis, :]
input_x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y_ = tf.math.floor(input_x*255)/255
y = tf.identity(y_, name='dnn_out')
sess.run(tf.global_variables_initializer())
constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
with tf.gfile.FastGFile(pb_file_path+'/{}_savemodel/model.pb'.format(name), mode='wb') as f:
f.write(constant_graph.SerializeToString())
print("model.pb generated, please in ffmpeg path use\n \n \
python tools/python/convert.py {}_savemodel/model.pb --outdir={}_savemodel/ \n \nto generate model.model\n".format(name,name))
output = sess.run(y, feed_dict={ input_x: in_data})
imageio.imsave("out.jpg", np.squeeze(output))
print("To verify, please ffmpeg path use\n \n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model={}_savemodel/model.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow -f framemd5 {}_savemodel/tensorflow_out.md5\n \
or\n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model={}_savemodel/model.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow {}_savemodel/out_tensorflow.jpg\n \nto generate output result of tensorflow model\n".format(name, name, name, name))
print("To verify, please ffmpeg path use\n \n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model={}_savemodel/model.model:input=dnn_in:output=dnn_out:dnn_backend=native -f framemd5 {}_savemodel/native_out.md5\n \
or \n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model={}_savemodel/model.model:input=dnn_in:output=dnn_out:dnn_backend=native {}_savemodel/out_native.jpg\n \nto generate output result of native model\n".format(name, name, name, name))
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It can be tested with the model generated with below python script:
import tensorflow as tf
import os
import numpy as np
import imageio
from tensorflow.python.framework import graph_util
name = 'ceil'
pb_file_path = os.getcwd()
if not os.path.exists(pb_file_path+'/{}_savemodel/'.format(name)):
os.mkdir(pb_file_path+'/{}_savemodel/'.format(name))
with tf.Session(graph=tf.Graph()) as sess:
in_img = imageio.imread('detection.jpg')
in_img = in_img.astype(np.float32)
in_data = in_img[np.newaxis, :]
input_x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
y = tf.math.ceil( input_x, name='dnn_out')
sess.run(tf.global_variables_initializer())
constant_graph = graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
with tf.gfile.FastGFile(pb_file_path+'/{}_savemodel/model.pb'.format(name), mode='wb') as f:
f.write(constant_graph.SerializeToString())
print("model.pb generated, please in ffmpeg path use\n \n \
python tools/python/convert.py ceil_savemodel/model.pb --outdir=ceil_savemodel/ \n \n \
to generate model.model\n")
output = sess.run(y, feed_dict={ input_x: in_data})
imageio.imsave("out.jpg", np.squeeze(output))
print("To verify, please ffmpeg path use\n \n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model=ceil_savemodel/model.pb:input=dnn_in:output=dnn_out:dnn_backend=tensorflow -f framemd5 ceil_savemodel/tensorflow_out.md5\n \n \
to generate output result of tensorflow model\n")
print("To verify, please ffmpeg path use\n \n \
./ffmpeg -i detection.jpg -vf format=rgb24,dnn_processing=model=ceil_savemodel/model.model:input=dnn_in:output=dnn_out:dnn_backend=native -f framemd5 ceil_savemodel/native_out.md5\n \n \
to generate output result of native model\n")
Signed-off-by: Mingyu Yin <mingyu.yin@intel.com>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
| |
We should not silently allocate an incorrect sized buffer.
Fixes trac issue #8718.
Signed-off-by: Reimar Döffinger <Reimar.Doeffinger@gmx.de>
Reviewed-by: Michael Niedermayer <michael@niedermayer.cc>
Reviewed-by: Guo, Yejun <yejun.guo@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It can be tested with the model generated with below python script:
import tensorflow as tf
import numpy as np
import imageio
in_img = imageio.imread('input.jpeg')
in_img = in_img.astype(np.float32)/255.0
in_data = in_img[np.newaxis, :]
x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
please uncomment the part you want to test
x_sinh_1 = tf.sinh(x)
x_out = tf.divide(x_sinh_1, 1.176) # sinh(1.0)
x_cosh_1 = tf.cosh(x)
x_out = tf.divide(x_cosh_1, 1.55) # cosh(1.0)
x_tanh_1 = tf.tanh(x)
x__out = tf.divide(x_tanh_1, 0.77) # tanh(1.0)
x_asinh_1 = tf.asinh(x)
x_out = tf.divide(x_asinh_1, 0.89) # asinh(1.0/1.1)
x_acosh_1 = tf.add(x, 1.1)
x_acosh_2 = tf.acosh(x_acosh_1) # accept (1, inf)
x_out = tf.divide(x_acosh_2, 1.4) # acosh(2.1)
x_atanh_1 = tf.divide(x, 1.1)
x_atanh_2 = tf.atanh(x_atanh_1) # accept (-1, 1)
x_out = tf.divide(x_atanh_2, 1.55) # atanhh(1.0/1.1)
y = tf.identity(x_out, name='dnn_out') #please only preserve the x_out you want to test
sess=tf.Session()
sess.run(tf.global_variables_initializer())
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'image_process.pb', as_text=False)
print("image_process.pb generated, please use \
path_to_ffmpeg/tools/python/convert.py to generate image_process.model\n")
output = sess.run(y, feed_dict={x: in_data})
imageio.imsave("out.jpg", np.squeeze(output))
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|
|
|
|
| |
Signed-off-by: Ting Fu <ting.fu@intel.com>
|