TY - JOUR
T1 - The role of spatial frequency channels in letter identification
AU - Majaj, Najib J.
AU - Pelli, Denis G.
AU - Kurshan, Peri
AU - Palomares, Melanie
N1 - Funding Information:
This project began as Peri Kurshan's Westinghouse project, while she was in high school. She was supposed to confirm Solomon and Pelli's (1994) guess that channel frequency scaled with letter frequency, which she failed to do, showing the opposite to be true, and becoming a semi-finalist in the 1997 Westinghouse National Science Talent competition. We thank Gordon Legge, as editor, and two anonymous reviewers for many helpful questions and suggestions, which helped clarify the issues. We thank Bart Farell, Marialuisa Martelli, Manoj Raghavan, Jamie Radner, and Cigdem Talgar for critical reading and helpful suggestions. Thanks to Jim Thomas and Lynn Olzak for sharing their thoughts on combining information across spatial frequency. Thanks to Josh Solomon for running extra analyses of the Solomon (2000) data to assess the noise additivity index. Preliminary versions of these results were presented at annual meetings of the Optical Society of America (1996, Rochester) and the Association for Research in Vision and Ophthalmology (1997, Fort Lauderdale). Supported by NIH grant EY04432 to Denis Pelli. Najib Majaj was supported in part by a grant from the Alfred P. Sloan Foundation.
PY - 2002
Y1 - 2002
N2 - How we see is today explained by physical optics and retinal transduction, followed by feature detection, in the cortex, by a bank of parallel independent spatial-frequency-selective channels. It is assumed that the observer uses whichever channels are best for the task at hand. Our current results demand a revision of this framework: Observers are not free to choose which channels they use. We used critical-band masking to characterize the channels mediating identification of broadband signals: letters in a wide range of fonts (Sloan, Bookman, Künstler, Yung), alphabets (Roman and Chinese), and sizes (0.1-55°). We also tested sinewave and squarewave gratings. Masking always revealed a single channel, 1.6±0.7 octaves wide, with a center frequency that depends on letter size and alphabet. We define an alphabet's stroke frequency as the average number of lines crossed by a slice through a letter, divided by the letter width. For sharp-edged (i.e. broadband) signals, we find that stroke frequency completely determines channel frequency, independent of alphabet, font, and size. Moreover, even though observers have multiple channels, they always use the same channel for the same signals, even after hundreds of trials, regardless of whether the noise is low-pass, high-pass, or all-pass. This shows that observers identify letters through a single channel that is selected bottom-up, by the signal, not top-down by the observer. We thought shape would be processed similarly at all sizes. Bandlimited signals conform more to this expectation than do broadband signals. Here, we characterize processing by channel frequency. For sinewave gratings, as expected, channel frequency equals sinewave frequency, $f_{\mathrm{channel}} = f$.
For bandpass-filtered letters, channel frequency is proportional to center frequency, $f_{\mathrm{channel}} \propto f_{\mathrm{center}}$ (log-log slope 1), when size is varied and the band (c/letter) is fixed, but channel frequency is less than proportional to center frequency, $f_{\mathrm{channel}} \propto f_{\mathrm{center}}^{2/3}$ (log-log slope 2/3), when the band is varied and size is fixed. Finally, our main result, for sharp-edged (i.e. broadband) letters and squarewaves, channel frequency depends solely on stroke frequency, $f_{\mathrm{channel}} / (10\ \mathrm{c/deg}) = \left( f_{\mathrm{stroke}} / (10\ \mathrm{c/deg}) \right)^{2/3}$, with a log-log slope of 2/3. Thus, large letters (and coarse squarewaves) are identified by their edges; small letters (and fine squarewaves) are identified by their gross strokes.
AB - How we see is today explained by physical optics and retinal transduction, followed by feature detection, in the cortex, by a bank of parallel independent spatial-frequency-selective channels. It is assumed that the observer uses whichever channels are best for the task at hand. Our current results demand a revision of this framework: Observers are not free to choose which channels they use. We used critical-band masking to characterize the channels mediating identification of broadband signals: letters in a wide range of fonts (Sloan, Bookman, Künstler, Yung), alphabets (Roman and Chinese), and sizes (0.1-55°). We also tested sinewave and squarewave gratings. Masking always revealed a single channel, 1.6±0.7 octaves wide, with a center frequency that depends on letter size and alphabet. We define an alphabet's stroke frequency as the average number of lines crossed by a slice through a letter, divided by the letter width. For sharp-edged (i.e. broadband) signals, we find that stroke frequency completely determines channel frequency, independent of alphabet, font, and size. Moreover, even though observers have multiple channels, they always use the same channel for the same signals, even after hundreds of trials, regardless of whether the noise is low-pass, high-pass, or all-pass. This shows that observers identify letters through a single channel that is selected bottom-up, by the signal, not top-down by the observer. We thought shape would be processed similarly at all sizes. Bandlimited signals conform more to this expectation than do broadband signals. Here, we characterize processing by channel frequency. For sinewave gratings, as expected, channel frequency equals sinewave frequency, $f_{\mathrm{channel}} = f$.
For bandpass-filtered letters, channel frequency is proportional to center frequency, $f_{\mathrm{channel}} \propto f_{\mathrm{center}}$ (log-log slope 1), when size is varied and the band (c/letter) is fixed, but channel frequency is less than proportional to center frequency, $f_{\mathrm{channel}} \propto f_{\mathrm{center}}^{2/3}$ (log-log slope 2/3), when the band is varied and size is fixed. Finally, our main result, for sharp-edged (i.e. broadband) letters and squarewaves, channel frequency depends solely on stroke frequency, $f_{\mathrm{channel}} / (10\ \mathrm{c/deg}) = \left( f_{\mathrm{stroke}} / (10\ \mathrm{c/deg}) \right)^{2/3}$, with a log-log slope of 2/3. Thus, large letters (and coarse squarewaves) are identified by their edges; small letters (and fine squarewaves) are identified by their gross strokes.
KW - Channels
KW - Contrast sensitivity function
KW - Identification
KW - Letters
KW - Low-frequency cut
KW - Masking
KW - Most sensitive channel
KW - Noise additivity
KW - Object recognition
KW - Scale dependence
KW - Scale invariance
KW - Sinewaves
KW - Spatial frequency
KW - Spatial vision
KW - Squarewaves
UR - http://www.scopus.com/inward/record.url?scp=0036235163&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036235163&partnerID=8YFLogxK
U2 - 10.1016/S0042-6989(02)00045-7
DO - 10.1016/S0042-6989(02)00045-7
M3 - Article
C2 - 11997055
AN - SCOPUS:0036235163
SN - 0042-6989
VL - 42
SP - 1165
EP - 1184
JO - Vision Research
JF - Vision Research
IS - 9
ER -