Submitted to: Naval Facilities Engineering Command Atlantic under HDR Environmental, Operations and Construction, Inc. Contract No. N62470-10-D-3011, Task Order 03 Prepared By

WHISTLE AND SCHOOL CLASSIFICATION

ROCCA uses a random forest classifier model based on the open-source statistical software package WEKA (http://www.cs.waikato.ac.nz/ml/weka/index.html). For more information on random forests and the WEKA package, please refer to Witten et al. (2011).

Whistle Classification

ROCCA measures 50 features from each whistle contour. See Appendix A for a description of each of these variables.

ROCCA’s Random Forest classifier was trained using 50 variables measured from single-species schools of dolphins that had visual confirmation of species identity (see Oswald et al. 2007 and Oswald 2013 for details on the training datasets). During whistle classification, features measured from a whistle contour are run through the Random Forest model and each tree in the forest produces a species classification. Each tree can be considered one ‘vote’ for a given species classification. Votes are tallied over all trees and the whistle is classified as the species with the most ‘votes’. In addition to classifying individual whistles, ROCCA classifies encounters based on the number of tree classifications for each species, summed over all of the whistles that were analyzed for that encounter.

The number of tree classifications for the predicted species is also used as a measure of the certainty of the classification. If a greater percentage of trees classifies the whistle as a particular species, then the classification is considered to have a higher degree of certainty. The ‘strong whistle threshold’ (specified in the ROCCA Parameters window) is the percentage of trees that must classify the whistle as a given species in order for that classification to be considered reliable. If the percentage of trees classifying the whistle as a particular species falls below the strong whistle threshold, the whistle is classified as ambiguous. Similarly, encounters are classified as ambiguous unless the percentage of tree votes (summed over all of the whistles in the encounter) for the predicted species exceeds the ‘strong school threshold’ (see Section 3.2.2 for details on how to set the strong whistle and strong school thresholds).
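The vote tally and the strong whistle threshold can be illustrated with a minimal Python sketch. This is not ROCCA's own code (ROCCA relies on WEKA); the function name, the argument names, and the alphabetical tie-break are assumptions made for illustration only.

```python
from collections import Counter

AMBIGUOUS = "Ambig"  # code used for ambiguous results in Appendix B


def classify_whistle(tree_votes, strong_whistle_threshold):
    """Tally one vote per tree; return (label, vote percentages by species).

    tree_votes: list of species codes, one entry per tree in the forest.
    strong_whistle_threshold: minimum percentage of trees (0-100) needed
    for the winning species to be reported instead of "Ambig".
    """
    if not tree_votes:
        raise ValueError("tree_votes must not be empty")
    counts = Counter(tree_votes)
    percentages = {sp: 100.0 * n / len(tree_votes) for sp, n in counts.items()}
    # Species with the most votes. Ties are broken alphabetically here,
    # which is an assumption: the manual does not describe tie handling.
    best = min(percentages, key=lambda sp: (-percentages[sp], sp))
    if percentages[best] < strong_whistle_threshold:
        return AMBIGUOUS, percentages
    return best, percentages


# Invented example: a 10-tree forest voting on one whistle.
votes = ["Tt"] * 6 + ["Sc"] * 3 + ["Dd"] * 1
label, pct = classify_whistle(votes, strong_whistle_threshold=50.0)
print(label, pct)
```

With these invented votes, 60 percent of the trees choose Tt, so the whistle is classified as Tt at a threshold of 50 and as Ambig at a threshold of 70.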

School Classification

The School Stats output file contains a list of possible species based on the classifier model used. There are two values stored for each species: the number of times a whistle has been classified to that species (also displayed on the ROCCA sidebar) and a cumulative total of the percentage of tree votes for the species (not displayed on the ROCCA sidebar). When a new whistle classification is saved to a School Stats file, the number of whistles classified as that species is increased by one and the percentages of tree votes for each species are added to the corresponding cumulative totals. ROCCA classifies an encounter as the species with the highest cumulative percentage of tree votes. If the highest cumulative percentage of tree votes falls below the strong school threshold (as specified in the ROCCA Parameters window, Section 3.2.2), the encounter is classified as Ambiguous.

Note! The species with the highest cumulative percentage of tree votes may be different from the species with the greatest number of whistle classifications (the value shown in the sidebar species list).
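The two running totals, and the way they can disagree, can be sketched in Python as follows. The class and method names are hypothetical, and the sketch compares the cumulative total directly against the threshold, following the wording above; whether ROCCA normalizes the total by the number of whistles is not stated here, so treat that detail as an assumption.

```python
class EncounterStats:
    """Running totals for one encounter (illustrative, not ROCCA's class)."""

    def __init__(self, species):
        self.whistle_counts = {sp: 0 for sp in species}
        self.cumulative_votes = {sp: 0.0 for sp in species}

    def add_whistle(self, classified_as, vote_percentages):
        # The whistle count of the classified species goes up by one ...
        self.whistle_counts[classified_as] = (
            self.whistle_counts.get(classified_as, 0) + 1
        )
        # ... and every species' vote percentage joins its running total.
        for sp, pct in vote_percentages.items():
            self.cumulative_votes[sp] = self.cumulative_votes.get(sp, 0.0) + pct

    def classify(self, strong_school_threshold):
        # Species with the highest cumulative vote total (first one on ties).
        best = max(self.cumulative_votes, key=self.cumulative_votes.get)
        if self.cumulative_votes[best] < strong_school_threshold:
            return "Ambig"
        return best


# Invented example: two weak Tt whistles and one strong Sc whistle.
enc = EncounterStats(["Ambig", "Dd", "Sc", "Tt"])
enc.add_whistle("Tt", {"Tt": 40.0, "Sc": 35.0, "Dd": 25.0})
enc.add_whistle("Tt", {"Tt": 40.0, "Sc": 35.0, "Dd": 25.0})
enc.add_whistle("Sc", {"Tt": 5.0, "Sc": 90.0, "Dd": 5.0})
print(enc.whistle_counts)
print(enc.cumulative_votes)
print(enc.classify(strong_school_threshold=100.0))
```

In this invented encounter, Tt has the most whistle classifications (two against one), yet Sc has the highest cumulative vote total (160 against 85), so the encounter is classified as Sc.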

DISPLAYING THE RESULTS – THE ROCCA SIDEBAR

The results of individual whistle classifications are grouped into encounters as defined by the user. Each group must be given a name, referred to as the encounter number. In addition to classifying individual whistles, ROCCA also classifies the overall encounter. The encounter classification is determined by summing the percentage of trees voting for each species over all of the whistles classified in that encounter. The species with the highest cumulative percentage of tree votes is the species classification for that encounter.

Figure 13. The ROCCA sidebar.

Encounter number: the current encounter number. This is the encounter number used when a new whistle is selected from the spectrogram display. Any combination of numbers and letters can be used to specify the encounter number.
Scroll buttons: allow you to scroll through the list of encounter numbers.
Classification results: displays a tally of the number of whistles classified as each species for the current encounter number. The list of possible species is based on the currently loaded classifier model. Species are denoted by the first letter of the genus and species (e.g., Gm = Globicephala macrorhynchus). The number beside the species name indicates the number of whistles classified to that species. See Appendix B for a list of species included in the tropical Pacific and Atlantic classifiers, along with their genus-species codes.
School classification: displays the species classification for the current encounter.
Rename Encounter: renames the current encounter. Any previously saved output files that use the old encounter number in the filename will be renamed using the new encounter number.

Note! The information contained within the whistle Contour Stats file is NOT updated—you must modify any references to the old encounter number manually. Also note that you are not allowed to duplicate encounter numbers.

Save Encounter: overwrites the current School Stats file (as defined in the ROCCA Parameters window) with the current list of encounters and classification results. School classification results are also saved automatically every five minutes.
New Encounter: creates a new encounter.
Whistle Start: lists the time and frequency of the first user-selected point on the spectrogram.
Whistle End: lists the time and frequency of the second user-selected point on the spectrogram.

Note! Once you select the second point, the portion of the spectrogram in between the first and second points is captured in a new popup window.

OUTPUT

ROCCA saves three different files during whistle classification: whistle clip, contour points, and contour features. ROCCA also saves the school stats automatically every five minutes, as well as when the Save Encounter button is clicked in the ROCCA sidebar (Section 7).

If a database module is being used, ROCCA will also save the data in two tables: Rocca_Whistle_Stats and Rocca_Detection_Stats.

Whistle Clip

ROCCA saves the whistle clip in a .wav file format to the output directory. The start and end points of the clip are defined by the start and end points that you originally selected in the spectrogram popup window. The channels saved to the clip file are specified in the ROCCA Parameters window (Section 3.2.1). ROCCA saves the file according to the filename defined in the ROCCA Parameters window (Section 3.2.4).

Contour Points

ROCCA saves the time/frequency pair for each extracted contour point in a .csv file in the output directory. The duty cycle, the energy in a frequency band around the peak frequency (as defined in the ROCCA Parameters window), and the RMS value of the amplitude are also saved. ROCCA saves the file according to the filename defined in the ROCCA Parameters window (Section 3.2.4).

Contour Features

ROCCA saves the features measured from the current contour, as well as the classification results (the percentage of trees voting for each species), in a .csv format Contour Stats file in the output directory. The information from each classified whistle is appended to the end of the file, and the file is never overwritten. Thus, this file will continue to collect classification information every time ROCCA is run.

Other information that is saved for each whistle includes the sound source, date and time, and encounter number. The end of each row in the Contour Stats file lists the name of the random forest model, the percentage of trees voting for each species, and a corresponding list of the species names. The species names are added to each row instead of to the header line because the header is created based on information from the first whistle contour analyzed. If you use a different classification model for the analysis of subsequent whistles, the species list may be different and may no longer match the header. By including the species list in the row, you are always able to verify which species were included in the classification algorithm for a particular whistle contour.

ROCCA saves the file according to the filename specified in the ROCCA Parameters window (Section 3.2.3). If a database module is being used, the data will also be saved to the Rocca_Whistle_Stats table.
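Because the species list travels with each row, the vote percentages can be recovered without relying on the header. The Python sketch below assumes the layout described in Appendix C: the final column holds the hyphen-separated species order, and the columns immediately before it hold one vote percentage per species. The function name and the sample row are invented, and the sample row is abbreviated (the feature columns are omitted).

```python
import csv
import io


def votes_from_row(row):
    """Return {species code: vote percentage} for one Contour Stats row.

    Assumes the final column is the species order (e.g. "Gm-Dd-Sc-Sf-Tt")
    and the columns just before it are the matching vote percentages.
    """
    species = row[-1].strip().split("-")
    values = row[-1 - len(species):-1]
    if len(values) != len(species):
        raise ValueError("row is too short for its species list")
    return dict(zip(species, (float(v) for v in values)))


# Invented, abbreviated row: source, date-time, detection count,
# encounter number, classified species, classifier name, votes, species.
sample = (
    "clip01.wav,2018-03-03 12:00:00,1,A1,Tt,MyClassifier.model,"
    "10.0,5.0,15.0,20.0,50.0,Gm-Dd-Sc-Sf-Tt\n"
)
row = next(csv.reader(io.StringIO(sample)))
print(votes_from_row(row))
```

Reading the species order from each row, rather than from the header, keeps the mapping correct even when different rows were produced with different classifier models.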

School Stats

ROCCA saves classification results for all encounters in a .csv format School Stats file in the output directory. For each encounter, ROCCA includes the cumulative random forest tree vote totals for each species, a list of species in the classifier, and the overall school classification (based on the species with the highest cumulative tree vote total).

Each time the School Stats file is saved, either through the auto-save function or by pressing the Save Encounter button, ROCCA overwrites the file in order to update any renamed encounter numbers. Since an encounter number can be renamed but never deleted, no information will be lost when overwriting an old file during a single PAMGuard session. However, if PAMGuard is closed and restarted, the file will be overwritten with blank data and all prior information will be lost. To guard against this, ROCCA searches for the file at startup. If the file exists, you are given the opportunity to rename it before it is lost, and/or load the existing data back into the system.

Note! When examining the classification results for a particular encounter number, you should refer to the species list at the end of the row instead of the species listed in the header. The header information is taken from the first encounter number listed. If subsequent encounter numbers use different classification models, the included species may change and this change is not reflected in the header.

ROCCA saves the School Stats file according to the filename specified in the ROCCA Parameters window (Section 3.2.3). If a database module is being used, the data will also be saved to the Rocca_Detection_Stats table.

LITERATURE CITED

Gillespie, D., J. Gordon, R. McHugh, D. McLaren, D.K. Mellinger, P. Redmond, A. Thode, P. Trinder, and D. Xiao. (2008). PAMGUARD: Semiautomated, open-source software for real-time acoustic detection and localization of cetaceans. Proceed. Instit. Acoust. 30, Part 5. 9 pp.

Oswald, J.N. (2013). Development of a Classifier for the Acoustic Identification of Delphinid Species in the Northwest Atlantic Ocean. Final Report. Submitted to HDR Environmental, Operations and Construction, Inc. Norfolk, Virginia under Contract No. CON005-4394-009, Subproject 164744, Task Order 003, Agreement # 105067. Prepared by Bio-Waves, Inc., Encinitas, California.

Oswald, J.N., S. Rankin, J. Barlow, and M.O. Lammers. (2007). A tool for real-time acoustic species identification of delphinid whistles. J. Acoust. Soc. Am. 122, 587-595.

Witten, I.H., E. Frank, and M.A. Hall. (2011). Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann Publishers, ISBN: 978-0-12-374856-0.

Appendix A:
Variables Measured by ROCCA

Variable	Explanation
Begsweep	slope of the beginning sweep (1 = positive, -1 = negative, 0 = zero)
Begup	binary variable: 1=beginning slope is positive, 0=beginning slope is negative
Begdwn	binary variable: 1=beginning slope is negative, 0=beginning slope is positive
Endsweep	slope of the end sweep (1 = positive, -1 = negative, 0 = zero)
Endup	binary variable: 1=ending slope is positive, 0=ending slope is negative
Enddwn	binary variable: 1=ending slope is negative, 0=ending slope is positive
Beg	beginning frequency (Hz)
End	ending frequency (Hz)
Min	minimum frequency (Hz)
Dur	duration (sec)
Range	maximum frequency–minimum frequency (Hz)
Max	maximum frequency (Hz)
mean freq	mean frequency (Hz)
median freq	median frequency (Hz)
std freq	standard deviation of the frequency (Hz)
Spread	difference between the 75th and the 25th percentiles of the frequency
quart freq	frequency at one quarter of the duration (Hz)
half freq	frequency at one half of the duration (Hz)
Threequart	frequency at three quarters of the duration (Hz)
Centerfreq	minimum frequency + (maximum frequency - minimum frequency)/2 (Hz)
rel bw	relative bandwidth: (max freq - min freq)/center freq
Maxmin	max freq/min freq
Begend	beg freq/end freq
Cofm	coefficient of frequency modulation: take 20 frequency measurements equally spaced in time, then subtract each frequency value from the one before it. COFM is the sum of the absolute values of these differences, all divided by 10,000
tot step	number of steps (10 percent or greater increase or decrease in frequency over two contour points)
tot inflect	number of inflection points (changes from positive to negative or negative to positive slope)
max delta	maximum time between inflection points
min delta	minimum time between inflection points
maxmin delta	max delta/min delta
mean delta	mean time between inflection points
std delta	standard deviation of the time between inflection points
median delta	median of the time between inflection points
mean slope	overall mean slope
mean pos slope	mean positive slope
mean neg slope	mean negative slope
mean absslope	mean absolute value of the slope
Posneg	mean positive slope/mean negative slope
perc up	percent of the whistle that has a positive slope
perc dwn	percent of the whistle that has a negative slope
perc flt	percent of the whistle that has zero slope
up dwn	number of inflection points that go from positive slope to negative slope
dwn up	number of inflection points that go from negative slope to positive slope
up flt	number of times the slope changes from positive to zero
dwn flt	number of times the slope changes from negative to zero
flt dwn	number of times the slope changes from zero to negative
flt up	number of times the slope changes from zero to positive
step up	number of steps that have increasing frequency
step dwn	number of steps that have decreasing frequency
step.dur	number of steps/duration
inflect.dur	number of inflection points/duration
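
A few of the variables above can be computed from a list of contour points, as in the following Python sketch. It is not ROCCA's implementation. The function names are hypothetical, the center frequency is taken as the midpoint between the minimum and maximum frequency, and the 20 equally spaced measurements for Cofm are obtained by linear interpolation between contour points; both of the latter are assumptions of this sketch.

```python
def interpolate(times, freqs, t):
    """Linearly interpolate the contour frequency (Hz) at time t (s)."""
    for i in range(1, len(times)):
        if t <= times[i]:
            t0, t1 = times[i - 1], times[i]
            f0, f1 = freqs[i - 1], freqs[i]
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)
    return freqs[-1]  # at or beyond the last contour point


def contour_features(times, freqs, n_samples=20):
    """Compute a subset of the Appendix A variables.

    times: strictly increasing times in seconds.
    freqs: positive frequencies in Hz, one per time.
    """
    f_min, f_max = min(freqs), max(freqs)
    center = f_min + (f_max - f_min) / 2  # midpoint (assumption, see above)
    # Frequency measurements equally spaced in time across the contour.
    step = (times[-1] - times[0]) / (n_samples - 1)
    samples = [
        interpolate(times, freqs, times[0] + k * step) for k in range(n_samples)
    ]
    # Sum of absolute differences between consecutive samples, over 10,000.
    cofm = sum(abs(b - a) for a, b in zip(samples, samples[1:])) / 10000
    return {
        "Dur": times[-1] - times[0],
        "Range": f_max - f_min,
        "Centerfreq": center,
        "rel bw": (f_max - f_min) / center,
        "Maxmin": f_max / f_min,
        "Begend": freqs[0] / freqs[-1],
        "Cofm": cofm,
    }


# Invented contour: a one-second linear upsweep from 8 kHz to 12 kHz.
features = contour_features([0.0, 1.0], [8000.0, 12000.0])
for name, value in features.items():
    print(name, value)
```

For a monotonic sweep such as this one, the sum of absolute differences equals the frequency range, so Cofm is approximately Range/10,000.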

Appendix B:
Genus Species Codes for the Tropical Pacific and Atlantic Classifiers

Tropical Pacific Classifier

Code	Scientific Name	Common name
Ambig	n/a	Ambiguous
Dc_Dd	Delphinus capensis and D. delphis	Long- and short-beaked common dolphin
Gm	Globicephala macrorhynchus	Short-finned pilot whale
Pc	Pseudorca crassidens	False killer whale
Sa	Stenella attenuata	Pantropical spotted dolphin
Sb	Steno bredanensis	Rough-toothed dolphin
Sc	Stenella coeruleoalba	Striped dolphin
Sl	Stenella longirostris	Spinner dolphin
Tt	Tursiops truncatus	Bottlenose dolphin

Atlantic Classifier

Code	Scientific Name	Common name
Ambig	n/a	Ambiguous
Dd	Delphinus delphis	Short-beaked common dolphin
Sc	Stenella coeruleoalba	Striped dolphin
Tt	Tursiops truncatus	Bottlenose dolphin
Sf	Stenella frontalis	Atlantic spotted dolphin
Gm	Globicephala macrorhynchus	Short-finned pilot whale

Appendix C:
Description of CSV File Columns

Contour Points File

Header	Description
Time [ms]	Time elapsed (since PAMGuard started)
Peak Frequency [Hz]	Frequency with the highest amplitude in the time slice
Duty Cycle, Energy, WindowRMS	Variables used internally by ROCCA

Contour Features File

Header	Description
Source	Source of acoustic data (sound card, filename, etc.)
Date-Time	Local (computer) date and time when the whistle was captured
Detection Count	Running tally of whistles captured since ROCCA was started. Number is incremented each time a whistle is sent to ROCCA
Encounter Number	Encounter number as specified by the user
Classified Species	Species classification of whistle
FREQMAX … STEPDUR	Features measured by ROCCA and used as input to the random forest classifier
Classifier	Name of classifier used
{no header}	The remaining columns contain the percentage of trees voting for each species. The final column contains the order of the species shown in the voting columns. For example, if the final column contains Gm-Dd-Sc-Sf-Tt, it indicates the first voting column contains the percentage of trees voting for Gm, the second voting column contains the percentage of trees voting for Dd, etc.

School Stats File

Header	Description
Encounter Number	Encounter number as specified by the user
{list of species, starting with Ambig}	The number of whistles classified as each species in the current encounter number
{list of species votes, starting with Ambig}	Percentage of trees voting for each species, summed over all whistles in the current encounter number. The species with the highest total percentage of votes is the overall encounter classification
Encounter Classification	Overall species classification for the current encounter

1 References to sections within this user’s manual have been hyperlinked.

2 For questions and requests related to a new classifier based on custom data, please contact Dr. Julie Oswald at Bio-Waves, Inc. at: julie.oswald@bio-waves.net.
