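IPC Classification Information
Country / Type | United States (US) Patent, Granted
International Patent Classification (IPC, 7th edition) |
Application No. | UP-0963306 (2004-10-12)
Registration No. | US-7715934 (2010-06-03)
Inventors / Address | Bland, William; Moore, James Edward
Applicant / Address |
Agent / Address |
Citation Information | Times cited: 5; Cited patents: 37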
Abstract
An input profile is generated from an input audio file using a measurable attribute that was also used to generate reference profiles from reference audio files. The input profile is then subjected to a process that was also used to generate a reference profiles tree, which is structured as a sparse binary tree, from the reference profiles. As a result of the process, information on reference profiles whose characteristics are similar to those of the input profile, with respect to the measurable attribute, is retrieved from resulting nodes of the reference profiles tree. The input profile is then compared with this subset of the reference profiles, representing potential matches, to determine whether it matches one of the reference profiles, is a spoof, or matches none of the reference profiles.
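As a rough illustration of how such a profile might be computed, the sketch below uses the attribute named in claims 4 and 5 of this patent: the number of zero crossings per chunk, with local maxima as the criterion for selecting chunks. The function names, the fixed `chunk_size`, and the treatment of a zero-valued sample as non-negative are assumptions made for this example, not details taken from the patent.

```python
from typing import List


def zero_crossing_profile(samples: List[float], chunk_size: int) -> List[int]:
    """Split samples into consecutive chunks and count sign changes in each.

    Assumption: a trailing partial chunk is dropped, and a sample equal
    to zero is treated as non-negative.
    """
    profile = []
    for start in range(0, len(samples) - chunk_size + 1, chunk_size):
        chunk = samples[start:start + chunk_size]
        crossings = sum(
            1 for a, b in zip(chunk, chunk[1:]) if (a < 0) != (b < 0)
        )
        profile.append(crossings)
    return profile


def local_maxima(profile: List[int]) -> List[int]:
    """Return indices of chunks strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    ]
```

Because the same two functions would be applied to both the input file and every reference file, the resulting profiles are directly comparable, which is the property the abstract relies on.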
Representative Claim
We claim:
1. A method for matching an input audio file with a plurality of reference audio files, comprising: generating an input profile by segmenting the input audio file into chunks and determining a value for a characteristic attribute of each of the chunks; identifying chunks of the input profile whose characteristic attribute values satisfy a criterion; determining nodes of a sparse binary tree that are associated with individual of the plurality of reference audio files to identify potential matches for the input audio file by processing, for and only for each chunk of the input profile whose characteristic attribute value satisfies the criterion, all chunks from the characteristic attribute value satisfying chunk to a last chunk of the input profile so as to move down left and right branch child nodes of the sparse binary tree starting from a root node wherein the determination of whether to move down the left or right branch child node for each chunk being processed depends upon whether the chunk being processed has a characteristic attribute value greater than a specified value; and searching for a match of the input audio file among the potential matches.
2. The method according to claim 1, further comprising: generating, prior to identifying potential matches for the input audio file, a plurality of reference profiles from corresponding ones of the plurality of reference audio files by segmenting each reference audio file into chunks and determining a value for the characteristic attribute for each of the chunks.
3. The method according to claim 2, wherein the plurality of reference profiles are associated with nodes of the sparse binary tree by identifying chunks of the plurality of reference profiles whose characteristic attribute values satisfy the criterion and processing, for and only for each chunk whose characteristic attribute value satisfies the criterion, all chunks from the characteristic attribute value satisfying chunk to a last chunk of its reference profile down left and right branch child nodes of the sparse binary tree starting with the root node wherein the determination of whether to move down the left or right branch child node for each chunk being processed depends upon whether the chunk being processed has a characteristic attribute value greater than the specified value so that upon completion of such processing, a profile hook identifying the reference audio file of the chunk being processed is stored at a current node upon completion of the processing for the characteristic attribute value satisfying chunk.
4. The method according to claim 3, wherein individual chunks of the input audio file includes information of digitized samples of an audio clip over a period of time and the characteristic attribute is a number of zero crossings of the digitized samples in the chunk.
5. The method according to claim 4, wherein the criterion is satisfied if the zero crossing count of a chunk is a local maximum.
6. The method according to claim 3, wherein the identification of chunks whose characteristic attribute values satisfy the criterion is performed on an ever increasing sampling basis.
7. The method according to claim 6, wherein the ever increasing sampling basis is a quadratically increasing sample basis.
8. The method according to claim 6, wherein the ever increasing sampling basis is an exponentially increasing sample basis.
9. The method according to claim 6, wherein the identification of chunks whose characteristic attribute values satisfy the criterion is performed by incrementing through the chunks at a specified velocity and acceleration.
10. The method according to claim 9, wherein the velocity is the number of chunks between local maxima in the input profile and the acceleration is the change in velocity divided by the number of chunks over which the change occurs.
11. The method according to claim 3, wherein the identification of potential matches for the input audio file with the plurality of reference profiles comprises identifying potential mini-matches by retrieving profile hooks associated with nodes in the sparse binary tree.
12. The method according to claim 11, wherein the identification of mini-matches among the plurality of reference profiles further comprises for individual reference profiles corresponding to the retrieved profile hooks: comparing a number of chunks of the input profile and corresponding chunks of the reference profile; and identifying a mini-match if corresponding chunks of the reference profile substantially matches those of the input profile.
13. The method according to claim 12, wherein the identification of mini-matches among the plurality of reference profiles further comprises: identifying a non-full mini-match using a best matching one of the reference profiles with the input profile if none of the reference profiles identified by the profile hooks substantially matches those of the input profile.
14. The method according to claim 13, wherein the identification of potential matches further comprises: merging any mini-matches and non-full mini-matches corresponding to the same reference profile and having an offset into the input profile at which the reference profile begins within a specified tolerance.
15. The method according to claim 14, wherein if a mini-match is merged with a non-full mini-match, then the merged entity is referred to as a mini-match.
16. The method according to claim 15, further comprising: identifying the input audio file as a spoof if all mini-matches identified for the input profile do not refer to the same reference profile.
17. The method according to claim 16, wherein the input audio file is not identified as a spoof if the sum of the total audio time covered by the mini-matches is less than a specified threshold value.
18. The method according to claim 17, wherein the input audio file is not identified as a spoof even if the sum of the total audio time covered by the mini-matches is not less than the specified threshold value or if any of the mini-matches has an associated error per second value that is greater than a first specified maximum value.
19. The method according to claim 18, wherein the searching for the match results in a best match being found if a percentage of the input profile and the reference profile covered by the best match exceeds some minimum value after ignoring all non-full mini-matches, ignoring mini-matches having an error per second value that is greater than a second specified maximum value, and taking into consideration any other programmed criteria.
20. An apparatus for matching an input audio file with a plurality of reference audio files, comprising at least one computer configured to: generate an input profile by segmenting the input audio file into chunks and determining a value for a characteristic attribute of each of the chunks; identify chunks of the input profile whose characteristic attribute values satisfy a criterion; determine nodes of a sparse binary tree that are associated with individual of the plurality of reference audio files to identify potential matches for the input audio file by processing, for and only for each chunk of the input profile whose characteristic attribute value satisfies the criterion, all chunks from the characteristic attribute value satisfying chunk to a last chunk of the input profile so as to move down left and right branch child nodes of the sparse binary tree starting from a root node wherein the determination of whether to move down the left or right branch child node for each chunk being processed depends upon whether the chunk being processed has a characteristic attribute value greater than a specified value; and search for a match of the input audio file among the potential matches.
21. The apparatus according to claim 20, wherein the at least one computer is further configured to: generate, prior to identifying potential matches for the input audio file, a plurality of reference profiles from corresponding ones of the plurality of reference audio files by segmenting each reference audio file into chunks and determining a value for the characteristic attribute for each of the chunks.
22. The apparatus according to claim 21, wherein the at least one computer is configured to generate the plurality of reference profiles so as to be associated with nodes of the sparse binary tree by identifying chunks of the plurality of reference profiles whose characteristic attribute values satisfy the criterion and processing, for and only for each chunk whose characteristic attribute value satisfies the criterion, all chunks from the characteristic attribute value satisfying chunk to a last chunk of its reference profile down left and right branch child nodes of the sparse binary tree starting with the root node wherein the determination of whether to move down the left or right branch child node for each chunk being processed depends upon whether the chunk being processed has a characteristic attribute value greater than the specified value so that upon completion of such processing, a profile hook identifying the reference audio file of the chunk being processed is stored at a current node upon completion of the processing for the characteristic attribute value satisfying chunk.
23. The apparatus according to claim 22, wherein the at least one computer is configured to generate individual of the chunks of the input audio file so as to include information of digitized samples of an audio clip over a period of time and the characteristic attribute is a number of zero crossings of the digitized samples in the chunk.
24. The apparatus according to claim 23, wherein the criterion used by the at least one computer is satisfied if the zero crossing count of a sampled chunk is a local maximum.
25. The method according to claim 22, wherein the at least one computer is configured to identify the chunks whose characteristic attribute values satisfy the criterion on an increasing sampling basis.
26. The apparatus according to claim 25, wherein the increasing sampling basis used by the at least one computer is a quadratically increasing sample basis.
27. The apparatus according to claim 25, wherein the increasing sampling basis used by the at least one computer is an exponentially increasing sample basis.
28. The apparatus according to claim 25, wherein the at least one computer is configured to identify the chunks whose characteristic attribute values satisfy the criterion by incrementing through the chunks at a specified velocity and acceleration.
29. The apparatus according to claim 28, wherein the velocity used by the at least one computer is the number of chunks between local maxima in the input profile and the acceleration used by the at least one computer is the change in velocity divided by the number of chunks over which the change occurs.
30. The apparatus according to claim 22, wherein the at least one computer is configured to identify potential matches for the input audio file with the plurality of reference profiles by identifying potential mini-matches by retrieving profile hooks associated with nodes in the sparse binary tree.
31. The apparatus according to claim 30, wherein the at least one computer is configured to identify the mini-matches for individual reference profiles corresponding to the retrieved profile hooks by comparing a number of chunks of the input profile and corresponding chunks of the reference profile and identifying a mini-match if corresponding chunks of the reference profile substantially matches those of the input profile.
32. The apparatus according to claim 31, wherein the at least one computer is configured to identify the mini-matches by identifying a non-full mini-match using a best matching one of the reference profiles with the input profile if none of the reference profiles identified by the profile hooks substantially matches those of the input profile.
33. The apparatus according to claim 32, wherein the at least one computer is configured to identify the potential matches by merging any mini-matches and non-full mini-matches corresponding to the same reference profile and having an offset into the input profile at which the reference profile begins within a specified tolerance.
34. The apparatus according to claim 33, wherein the at least one computer is configured so that if a mini-match is merged with a non-full mini-match, then the merged entity is referred to as a mini-match.
35. The apparatus according to claim 34, wherein the at least one computer is configured to identify the input audio file as a spoof if all mini-matches identified for the input profile do not refer to the same reference profile.
36. The apparatus according to claim 35, wherein the at least one computer is configured to not identify the input audio file as a spoof if the sum of the total audio time covered by the mini-matches is less than a specified threshold value.
37. The apparatus according to claim 36, wherein the at least one computer is configured to not identify the input audio file as a spoof even if the sum of the total audio time covered by the mini-matches is not less than the specified threshold value or if any of the mini-matches has an associated error per second value that is greater than a first specified maximum value.
38. The apparatus according to claim 37, wherein the at least one computer is configured to search for the match so as to result in a best match being found if a percentage of the input profile and the reference profile covered by the best match exceeds some minimum value after ignoring all non-full mini-matches, ignoring mini-matches having an error per second value that is greater than a second specified maximum value, and taking into consideration any other programmed criteria.
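The tree walk recited in claims 1 and 3 can be sketched in a few lines of Python. This is a minimal reading of the claim language under stated assumptions, not the patentee's implementation: the names `Node`, `add_reference`, and `find_candidates` are hypothetical, profile hooks are modelled as plain string identifiers, the criterion is the local-maximum test of claim 5, and child nodes are created only when first visited so that the tree stays sparse.

```python
from typing import Dict, List, Set


class Node:
    """One node of the sparse binary tree; children exist only if visited."""

    def __init__(self) -> None:
        self.children: Dict[bool, "Node"] = {}  # False = left, True = right
        self.hooks: Set[str] = set()  # hypothetical profile hooks (reference ids)


def local_maxima(profile: List[int]) -> List[int]:
    """Indices of chunks strictly greater than both neighbours (claim 5)."""
    return [
        i
        for i in range(1, len(profile) - 1)
        if profile[i - 1] < profile[i] > profile[i + 1]
    ]


def add_reference(root: Node, ref_id: str, profile: List[int], threshold: int) -> None:
    """For each qualifying chunk, walk from it to the last chunk and store a hook."""
    for start in local_maxima(profile):
        node = root
        for value in profile[start:]:
            go_right = value > threshold  # branch on "greater than a specified value"
            if go_right not in node.children:
                node.children[go_right] = Node()
            node = node.children[go_right]
        node.hooks.add(ref_id)


def find_candidates(root: Node, profile: List[int], threshold: int) -> Set[str]:
    """Repeat the same walk for the input profile and collect hooks reached."""
    found: Set[str] = set()
    for start in local_maxima(profile):
        node = root
        for value in profile[start:]:
            node = node.children.get(value > threshold)
            if node is None:  # no reference profile took this path
                break
        else:
            found |= node.hooks
    return found


root = Node()
add_reference(root, "A", [1, 5, 2, 7, 3], threshold=3)
add_reference(root, "B", [2, 6, 1, 1, 1], threshold=3)
print(find_candidates(root, [0, 4, 0, 9, 1], threshold=3))  # candidates for a noisy copy of A
```

Each chunk contributes only one bit (above or below the threshold) to the path, so an input whose values differ slightly from a reference can still reach the same node. The hooks returned are only potential matches; the claims then call for a chunk-by-chunk comparison (the mini-match steps of claims 11 to 19) before a match or spoof is declared.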