* Determine the multithreading information
get_multithreading_operators (TypeExclusive, TypeMutual, TypeReentrant, TypeIndependent)
* Expanding this procedure shows that it is built on the get_operator_info operator:
* get names of all operators of the library
get_operator_name ('', OperatorNames)
get_operator_info (OperatorNames[Index], 'parallelization', Information)
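The two calls above index OperatorNames[Index] without showing the surrounding loop. The following is a minimal sketch of such a loop, not the body of the official procedure: it only collects the operators whose 'parallelization' slot returns some text and does not try to parse that text. The variable names WithInfo and InfoCount are illustrative assumptions.

```hdevelop
* Sketch: collect all operators that carry parallelization information.
get_operator_name ('', OperatorNames)
WithInfo := []
for Index := 0 to |OperatorNames| - 1 by 1
    get_operator_info (OperatorNames[Index], 'parallelization', Information)
    * Keep the operator only if the slot returned a non-empty string.
    if (|Information| > 0)
        if (Information[0] != '')
            WithInfo := [WithInfo, OperatorNames[Index]]
        endif
    endif
endfor
* Number of operators for which information was found.
InfoCount := |WithInfo|
```

The same loop can be reused for the other slots by replacing 'parallelization' with the slot name of interest.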
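From the official example query_system_parameters.hdev:
* Parallelization
get_system ('processor_num', ProcessorNum)
get_system ('thread_pool', ThreadPool)
get_system ('thread_num', ThreadNum)
* Automatic Operator Parallelization (AOP); the default value is 'true'
get_system ('parallelize_operators', AOP)
* 'reentrant' controls whether HALCON protects its shared data so that operators
* can safely be called from several threads at the same time; the default value is 'true'
get_system ('reentrant', Reentrant)
* Deliberately switch AOP off to compare performance
*set_system('parallelize_operators','false')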
* Determine the parallelization method of all parallelized operators
get_parallel_method_operators (SplitTuple, SplitChannel, SplitDomain, SplitPartial, None)
AutoParallel := [SplitTuple,SplitChannel,SplitDomain,SplitPartial]
AutoParallel := uniq(sort(AutoParallel))
* Expanding this procedure shows that it is built on the get_operator_info operator:
* get names of all operators of the library
get_operator_name ('', OperatorNames)
get_operator_info (OperatorNames[Index], 'parallel_method', Information)
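5. If you do not want to rely on AOP and prefer to parallelize the work yourself, things get more involved: you have to use multithreading, split the image into parts, process the parts, and merge the results at the end. This takes more specialist knowledge; for details see the official example simulate_aop.hdev and the official manual parallel_programming.pdf.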
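The split, process, and merge steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code of simulate_aop.hdev: the image name 'fabrik' is assumed to be one of the example images shipped with HALCON, NumParts is an arbitrary choice, and invert_image stands in for the real per-strip work. The loop runs sequentially here; in a real implementation each strip would be handed to its own thread.

```hdevelop
* Sketch: split an image into horizontal strips, process each, merge the results.
* Assumption: 'fabrik' is an example image shipped with HALCON.
read_image (Image, 'fabrik')
get_image_size (Image, Width, Height)
NumParts := 4
* Integer division; the last strip takes the remaining rows.
PartHeight := Height / NumParts
gen_empty_obj (Parts)
OffsetRows := []
for Index := 0 to NumParts - 1 by 1
    Row := Index * PartHeight
    if (Index == NumParts - 1)
        CurHeight := Height - Row
    else
        CurHeight := PartHeight
    endif
    crop_part (Image, Strip, Row, 0, Width, CurHeight)
    * Placeholder for the real work; a point operation needs no overlap.
    invert_image (Strip, StripProcessed)
    concat_obj (Parts, StripProcessed, Parts)
    OffsetRows := [OffsetRows, Row]
endfor
* Merge: place every strip back at its original row offset.
OffsetCols := gen_tuple_const(NumParts, 0)
MinusOnes := gen_tuple_const(NumParts, -1)
tile_images_offset (Parts, Result, OffsetRows, OffsetCols, MinusOnes, MinusOnes, MinusOnes, MinusOnes, Width, Height)
```

A point operation was chosen on purpose. Neighborhood filters such as smoothing would need overlapping strips, otherwise the strip borders show up in the merged result.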
*set_system('parallelize_operators','false')
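Because 'parallelize_operators' is a global setting, it is worth restoring it after an experiment. A minimal sketch, assuming that the value returned by get_system can be passed back to set_system unchanged; the variable name PreviousAOP is illustrative.

```hdevelop
* Sketch: switch AOP off for a measurement and restore the previous state.
get_system ('parallelize_operators', PreviousAOP)
set_system ('parallelize_operators', 'false')
* ... run the code to be measured here ...
* Assumption: the saved value is accepted by set_system as is.
set_system ('parallelize_operators', PreviousAOP)
```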
From the official example compute_devices.hdev:
* This example shows how to use compute devices with HALCON.
*
dev_update_off ()
dev_close_window ()
dev_open_window_fit_size (0, 0, 640, 480, -1, -1, WindowHandle)
set_display_font (WindowHandle, 16, 'mono', 'true', 'false')
*
* Get list of all available compute devices.
query_available_compute_devices (DeviceIdentifier)
*
* End example if no device could be found.
if (|DeviceIdentifier| == 0)
    return ()
endif
*
* Display basic information on detected devices.
disp_message (WindowHandle, 'Found ' + |DeviceIdentifier| + ' Compute Device(s):', 'window', 12, 12, 'black', 'true')
for Index := 0 to |DeviceIdentifier| - 1 by 1
    get_compute_device_info (DeviceIdentifier[Index], 'name', DeviceName)
    get_compute_device_info (DeviceIdentifier[Index], 'vendor', DeviceVendor)
    Message[Index] := 'Device #' + Index + ': ' + DeviceVendor + ' ' + DeviceName
endfor
disp_message (WindowHandle, Message, 'window', 42, 12, 'white', 'false')
disp_continue_message (WindowHandle, 'black', 'true')
stop ()
* Determine all operators that support OpenCL
get_opencl_operators (OpenCLSupport)
* Expanding this procedure shows that it is built on the get_operator_info operator:
get_operator_name ('', OperatorNames)
get_operator_info (OperatorNames[Index], 'compute_device', Information)
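When only one operator matters, the three slots used in this article can be queried for it directly instead of scanning the whole library. A minimal sketch; the output variable names are illustrative assumptions, and the returned text is only meant to be inspected in the variable window, not parsed.

```hdevelop
* Sketch: inspect one operator before choosing an acceleration strategy.
OperatorName := 'edges_sub_pix'
* Multithreading type of the operator.
get_operator_info (OperatorName, 'parallelization', MultithreadingInfo)
* Method(s) used by Automatic Operator Parallelization, if any.
get_operator_info (OperatorName, 'parallel_method', ParallelMethodInfo)
* Compute device (OpenCL) support, if any.
get_operator_info (OperatorName, 'compute_device', ComputeDeviceInfo)
```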
* See the official examples optimize_aop.hdev, query_aop_info.hdev, and simulate_aop.hdev.
* Example: performance test of the edges_sub_pix operator
* Switch off display updates first so that they do not distort the timings
dev_update_off ()
dev_close_window ()
dev_open_window_fit_size (0, 0, 640, 480, -1, -1, WindowHandle)
set_display_font (WindowHandle, 16, 'mono', 'true', 'false')
get_system ('processor_num', NumCPUs)
get_system ('parallelize_operators', AOP)
* Read the image
read_image (Image, 'D:/hellowprld/2/1-.jpg')
* Convert a color image to gray; a single-channel image is used as is
count_channels (Image, Channels)
if (Channels == 3 or Channels == 4)
    rgb1_to_gray (Image, ImageGray)
else
    copy_image (Image, ImageGray)
endif
alpha := 5
low := 10
high := 20
* Test 1: AOP switched off, i.e. no parallel acceleration
set_system ('parallelize_operators', 'false')
get_system ('parallelize_operators', AOP)
count_seconds (T0)
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T1)
Time0 := (T1 - T0) * 1000
stop ()
* Test 2: automatic parallel acceleration with AOP
* HALCON enables AOP by default, i.e. parallelize_operators is 'true'
set_system ('parallelize_operators', 'true')
count_seconds (T1)
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T2)
Time1 := (T2 - T1) * 1000
stop ()
* Test 3: GPU acceleration; in HALCON 19.11, 82 operators support GPU acceleration
* With GPU acceleration the data is first copied from the CPU to the GPU, processed there,
* and then copied back from the GPU to the CPU. Both transfers take time.
* Is GPU acceleration always faster than regular AOP? Not necessarily. The result depends on
* the quality of the graphics card.
query_available_compute_devices (DeviceIdentifiers)
DeviceHandle := 0
for i := 0 to |DeviceIdentifiers| - 1 by 1
    get_compute_device_info (DeviceIdentifiers[i], 'name', Name)
    * Open the GPU by its name
    if (Name == 'GeForce GT 630')
        open_compute_device (DeviceIdentifiers[i], DeviceHandle)
        break
    endif
endfor
if (DeviceHandle # 0)
    set_compute_device_param (DeviceHandle, 'asynchronous_execution', 'false')
    init_compute_device (DeviceHandle, 'edges_sub_pix')
    activate_compute_device (DeviceHandle)
    * Query the cache settings of the graphics card
    * The default buffer cache capacity is 1/3 of the graphics card memory
    get_compute_device_param (DeviceHandle, 'buffer_cache_capacity', GenParamValue0)
    get_compute_device_param (DeviceHandle, 'buffer_cache_used', GenParamValue1)
    get_compute_device_param (DeviceHandle, 'image_cache_capacity', GenParamValue2)
    get_compute_device_param (DeviceHandle, 'image_cache_used', GenParamValue3)
    *GenParamValue0 := GenParamValue0 / 3
    *set_compute_device_param (DeviceHandle, 'buffer_cache_capacity', GenParamValue0)
    *get_compute_device_param (DeviceHandle, 'buffer_cache_capacity', GenParamValue4)
endif
count_seconds (T3)
* If the graphics card memory is insufficient, this fails with
* error #4104 : Out of compute device memory
* If no matching device was found above, this call runs on the CPU
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T4)
Time2 := (T4 - T3) * 1000
if (DeviceHandle # 0)
    deactivate_compute_device (DeviceHandle)
endif
stop ()
* Test 4: manual AOP optimization
set_system ('parallelize_operators', 'true')
get_system ('parallelize_operators', AOP)
* 4.1 Optimize the number of threads with the 'threshold' model
optimize_aop ('edges_sub_pix', 'byte', 'no_file', ['file_mode','model','parameters'], ['nil','threshold','false'])
count_seconds (T5)
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T6)
Time3 := (T6 - T5) * 1000
* 4.2 Optimize the number of threads with the 'linear' model
optimize_aop ('edges_sub_pix', 'byte', 'no_file', ['file_mode','model','parameters'], ['nil','linear','false'])
count_seconds (T7)
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T8)
Time4 := (T8 - T7) * 1000
stop ()
* 4.3 Optimize the number of threads with the 'mlp' model
optimize_aop ('edges_sub_pix', 'byte', 'no_file', ['file_mode','model','parameters'], ['nil','mlp','false'])
count_seconds (T9)
edges_sub_pix (ImageGray, Edges1, 'canny', alpha, low, high)
count_seconds (T10)
Time5 := (T10 - T9) * 1000
stop ()
dev_clear_window ()
Message := 'edges_sub_pix runtimes:'
Message[1] := 'CPU only Time0 without AOP=' + Time0 + 'ms,'
Message[2] := 'CPU only Time1 with AOP=' + Time1 + 'ms,'
Message[3] := 'GPU use Time2=' + Time2 + 'ms,'
Message[4] := 'optimize Time3 threshold=' + Time3 + 'ms'
Message[5] := 'optimize Time4 linear=' + Time4 + 'ms'
Message[6] := 'optimize Time5 mlp=' + Time5 + 'ms'
disp_message (WindowHandle, Message, 'window', 12, 12, 'red', 'false')
stop ()
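Each test above times a single call, so the first measurement may include one-time setup costs and the numbers can vary from run to run. A minimal sketch of a more stable measurement follows: one untimed warm-up call, then the mean, minimum, and maximum over several runs. The image name 'fabrik' is assumed to be one of the example images shipped with HALCON, and NumRuns is an arbitrary choice; neither comes from the original test.

```hdevelop
* Sketch: average several runs instead of timing a single call.
* Assumption: 'fabrik' is an example image shipped with HALCON.
read_image (Image, 'fabrik')
NumRuns := 10
* One untimed warm-up call so that one-time setup costs are excluded.
edges_sub_pix (Image, Edges, 'canny', 5, 10, 20)
Times := []
for Run := 1 to NumRuns by 1
    count_seconds (TStart)
    edges_sub_pix (Image, Edges, 'canny', 5, 10, 20)
    count_seconds (TEnd)
    * Store the runtime of this run in milliseconds.
    Times := [Times, (TEnd - TStart) * 1000.0]
endfor
MeanTime := mean(Times)
MinTime := min(Times)
MaxTime := max(Times)
```

The same loop can wrap any of the four test configurations, so that the comparison is based on averages rather than on one call each.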
Performance test results for the edges_sub_pix operator:
Performance test results for the rotate_image operator:
This article comes from Yongge's website 《少有人走的路》 (wwww.skcircle.com). Please credit the source when reposting.


