
Video-based bird posture recognition using dual feature-rates deep fusion convolutional neural network

Highlights:
• The first study to focus on recognizing multiple behaviors of different birds from video streams.
• The TNL module generates features at various rates and uses a transpose mechanism to enhance them.
• DF2-Net transforms feature rates and iteratively fuses them to obtain the relationships of behaviors at various rates.
• We collected a new video-based bird postures (behaviors) dataset containing eight types of postures.

Changes in birds' behaviors must be detected promptly to assess their health and habitat status and to provide timely medical treatment and environmental remediation. Automating bird behavior recognition addresses this need and assists in breeding and protecting birds. This paper proposes a transposed non-local (TNL) module built on a time pyramid network to establish a dual feature-rates deep fusion net (DF2-Net) for bird behavior recognition. The time pyramid network uses spatial alignment and temporal pooling to extract features at different rates from features of different depths and then fuses the information they contain. On this basis, the TNL module uses the features at different rates to compute a relationship matrix over every time slice and spatial position. TNL then applies a transpose operation so that, when each original feature is multiplied by the matrix, it exploits the relationship oriented to match that feature. The module thereby integrates relational information about behaviors at different rates and enhances the features at each rate, improving the model's recognition of dynamic behaviors. Our study makes three contributions: (1) the TNL module takes features at different rates as inputs and uses a transpose mechanism to obtain two relationships in opposite directions, each matching its corresponding input; (2) DF2-Net transforms a single-rate feature into two rate features and iteratively fuses them to capture information and behavior relationships at various rates; (3) we collected a unique video dataset of bird behaviors to fill the gap in video datasets and to support the study of bird behaviors. The experiments compare DF2-Net with well-known video-based recognition models on the self-collected bird behavior dataset, which contains eight behaviors. DF2-Net achieves the best classification performance, reaching 80.87%, 81.35%, 80.70%, and 81.35% in precision, recall, F1-score, and overall accuracy (OA); these are 1.81%, 2.43%, 2.20%, and 2.43% higher than the second-best approach (TPN) with 8 frames, respectively. DF2-Net also outperforms state-of-the-art methods at other frame counts, with 16 frames proving most suitable for bird behavior recognition. Ablation experiments examine the efficiency of the TNL module, its optimal location and internal operations, and the most suitable parameters of DF2-Net; they show that the TNL module markedly improves recognition accuracy on dynamic behaviors, confirming its validity and rationality. The proposed model is therefore practical and feasible for automatically recognizing bird behavior.
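The following is a minimal sketch, assuming PyTorch, of how a transposed non-local exchange between two feature rates can be expressed; it is not the authors' released code. The class name TNLSketch, the 1x1x1 projections, the slow/fast naming, and all tensor shapes are illustrative assumptions, and the paper's exact TNL configuration (projection widths, normalization, placement within the time pyramid network) may differ. It only illustrates the transpose mechanism described above: one relationship matrix is computed between the positions of the two rate streams, and the matrix and its transpose are used so that each enhanced output matches its own input.

import torch
import torch.nn as nn
import torch.nn.functional as F


class TNLSketch(nn.Module):
    """Illustrative transposed non-local exchange between two feature rates."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1x1 projections (query/key/value style) for the two rate streams.
        self.theta = nn.Conv3d(channels, channels, kernel_size=1)  # projects the slow stream
        self.phi = nn.Conv3d(channels, channels, kernel_size=1)    # projects the fast stream
        self.g_slow = nn.Conv3d(channels, channels, kernel_size=1)
        self.g_fast = nn.Conv3d(channels, channels, kernel_size=1)

    def forward(self, x_slow: torch.Tensor, x_fast: torch.Tensor):
        # x_slow: (B, C, T_s, H, W) slow-rate features; x_fast: (B, C, T_f, H, W) fast-rate features.
        q = self.theta(x_slow).flatten(2)        # (B, C, N_s) with N_s = T_s*H*W
        k = self.phi(x_fast).flatten(2)          # (B, C, N_f) with N_f = T_f*H*W
        v_slow = self.g_slow(x_slow).flatten(2)  # (B, C, N_s)
        v_fast = self.g_fast(x_fast).flatten(2)  # (B, C, N_f)

        # Relationship matrix between every time slice / spatial position of the two rates.
        rel = torch.einsum("bcn,bcm->bnm", q, k)          # (B, N_s, N_f)

        # One direction enhances the slow-rate feature; the transposed matrix
        # enhances the fast-rate feature, so each output matches its own input.
        attn_sf = F.softmax(rel, dim=-1)                  # (B, N_s, N_f)
        attn_fs = F.softmax(rel.transpose(1, 2), dim=-1)  # (B, N_f, N_s)

        out_slow = torch.einsum("bnm,bcm->bcn", attn_sf, v_fast)  # (B, C, N_s)
        out_fast = torch.einsum("bnm,bcm->bcn", attn_fs, v_slow)  # (B, C, N_f)

        # Residual connections, restored to the original 5-D layout.
        return (x_slow + out_slow.reshape(x_slow.shape),
                x_fast + out_fast.reshape(x_fast.shape))


# Hypothetical usage: an 8-frame slow stream and a 16-frame fast stream with 64 channels.
tnl = TNLSketch(channels=64)
slow = torch.randn(2, 64, 8, 7, 7)
fast = torch.randn(2, 64, 16, 7, 7)
enhanced_slow, enhanced_fast = tnl(slow, fast)  # both shapes are preserved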

Bibliographic Details
Published in: Ecological indicators, 2022-08, Vol. 141, p. 109141, Article 109141
Main Authors: Lin, Chih-Wei; Chen, Zhongsheng; Lin, Mengxiang
Format: Article
Language: English
Subjects: Behavior rate; Bird behavior recognition; birds; Convolutional neural network; data collection; habitats; Information fusion; medical treatment; neural networks; posture; remediation
ISSN: 1470-160X
EISSN: 1872-7034
DOI: 10.1016/j.ecolind.2022.109141
Publisher: Elsevier Ltd