Visual processing finds colors, features, parts, wholes, spatial relations, and motions {vision, physiology}. Brain first extracts elementary perceptual units, contiguous lines, and non-accidental properties.
properties: sizes
Observers do not know actual object sizes but only judge relative sizes.
properties: reaction speed
Reaction to visual perception takes 450 milliseconds [Bachmann, 2000] [Broca and Sulzer, 1902] [Efron, 1967] [Efron, 1970] [Efron, 1973] [Taylor and McCloskey, 1990] [Thorpe et al., 1996] [VanRullen and Thorpe, 2001].
properties: timing
Location perception comes before color perception. Color perception comes before orientation perception. Color perception is 80 ms before motion perception. If people must pair color with motion, they associate the current color with the motion that occurred about 100 ms earlier. Brain associates two colors or two motions before associating a color with a motion.
processes: change perception
Brain does not maintain scene between separate images. Perceptual cortex changes only if brain detects change. Perceiving changes requires high-level processing.
processes: contrast
Retina neurons code for contrast, not brightness. Retina compares point brightness with average brightness. Retinal-nerve signal strength automatically adjusts to same value, whatever scene average brightness.
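A minimal sketch (hypothetical brightness values, Python) of the contrast coding described above: each point's signal is its brightness relative to the local average, so scaling the whole scene's brightness leaves the signal unchanged.

    # Sketch: contrast coding compares point brightness with the local average,
    # so scaling overall scene brightness leaves the signal the same.
    def contrast_signal(brightness, neighborhood):
        local_mean = sum(neighborhood) / len(neighborhood)
        return (brightness - local_mean) / local_mean  # dimensionless contrast

    dim_scene = [10, 12, 8, 10]                   # hypothetical local brightness values
    bright_scene = [b * 100 for b in dim_scene]   # same scene, 100x brighter

    print(contrast_signal(12, dim_scene))         # 0.2
    print(contrast_signal(1200, bright_scene))    # 0.2 -- same contrast signal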
processes: orientation response
High-contrast feature or object movements cause eye to turn toward object direction {orientation response, vision}.
processes: voluntary eye movements
Posterior parietal and pre-motor cortex plan and command voluntary eye movements [Bridgeman et al., 1979] [Bridgeman et al., 1981] [Goodale et al., 1986]. Stimulating superior-colliculus neurons can cause angle-specific eye rotation. Stimulating frontal-eye-field or other superior-colliculus neurons makes eyes move to specific locations, no matter from where eye started.
information
Most visual information comes from receptors near boundaries, which have large brightness or color contrasts. For the dark-adapted eye, each absorbed photon supplies one information bit. At higher luminance, 10,000 photons make one bit.
People lower and raise eyelids {blinking}| every few seconds.
purpose
Eyelids close and open to lubricate eye [Gawne and Martin, 2000] [Skoyles, 1997] [Volkmann et al., 1980]. Blinking can be a reflex to protect eye.
rate
Blinking rate increases with anxiety, embarrassment, stress, or distraction, and decreases with concentration. Mind inhibits blinking just before anticipated events.
perception
Automatic blinks do not noticeably change scene [Akins, 1996] [Blackmore et al., 1995] [Dmytryk, 1984] [Grimes, 1996] [O'Regan et al., 1999] [Rensink et al., 1997] [Simons and Chabris, 1999] [Simons and Levin, 1997] [Simons and Levin, 1998] [Wilken, 2001].
Vision maintains constancies: size constancy, shape constancy, color constancy, and brightness constancy {constancy, vision}. Size constancy is accurate and learned.
Scene features land on retina at distances {eccentricity, retina} {visual eccentricity} from fovea.
Visual features can blend {feature inheritance} [Herzog and Koch, 2001].
If limited or noisy stimuli come from space region, perception completes region boundaries and surface textures {filling-in}| {closure, vision}, using neighboring boundaries and surface textures.
perception
Filling-in always happens, so people never see regions with missing information. If region has no information, people do not notice region, only scene.
perception: conceptual filling-in
Brain perceives occluded object as whole-object figure partially hidden behind intervening-object ground {conceptual filling-in}, not as separate, unidentified shape beside intervening object.
perception: memory
Filling-in uses whole brain, especially innate and learned memories, as various neuron assemblies form and dissolve and excite and inhibit.
perception: information
Because local neural processing makes incomplete and approximate representations, typically with ambiguities and contradictions, global information uses marked and indexed features to build complete and consistent perception. Brain uses global information when local region has low receptor density, such as retina blindspot or damaged cells. Global information aids perception during blinking and eye movements.
processes: expansion
Surfaces recruit neighboring similar surfaces to expand homogeneous regions by wave entrainment. Contours align by wave entrainment.
processes: lateral inhibition
Lateral inhibition distinguishes and sharpens boundaries. Surfaces use constraint satisfaction to optimize edges and regions.
processes: spreading
Brain fills in using line completion, motion continuation, and color spreading. Brain fills areas and completes half-hidden object shapes. Blindspot filling-in maintains lines and edges {completion, filling-in}, preserves motion using area MT, and keeps color using area V4.
processes: surface texture
Surfaces have periodic structure and spatial frequency. Surface texture can expand to help filling in. Blindspot filling-in continues background texture using area V3.
processes: interpolation
Brain fills in using plausible guesses from surroundings and interpolation from periphery. For large damaged visual-cortex region, filling-in starts at edges and goes inward toward center, taking several seconds to finish [Churchland and Ramachandran, 1993] [Dahlbom, 1993] [Kamitani and Shimojo, 1999] [Pessoa and DeWeerd, 2003] [Pessoa et al., 1998] [Poggio et al., 1985] [Ramachandran, 1992] [Ramachandran and Gregory, 1991].
Stimuli blend if less than 200 milliseconds apart {flicker fusion frequency} [Efron, 1973] [Fahle, 1993] [Gowdy et al., 1999] [Gur and Snodderly, 1997] [Herzog et al., 2003] [Nagarajan et al., 1999] [Tallal et al., 1998] [Yund et al., 1983] [Westheimer and McKee, 1977].
People have different abilities to detect color radiance. Typical people {Standard Observer} have maximum sensitivity at 555 nm and see brightness {luminance, Standard Observer} according to standard radiance weightings at different wavelengths. Brightness varies with luminance logarithm.
In dim light, without focus on anything, black, gray, and white blobs, smaller in brighter light and larger in dimmer light, flicker on surfaces. In darkness, people see large-size regions slowly alternate between black and white. Brightest blobs are up to ten times brighter than background. In low-light conditions, people see three-degrees-of-arc circular regions, alternating randomly between black and white several times each second {variable resolution}. If eyes move, pattern moves. In slightly lighter conditions, people see one-degree-of-arc circular regions, alternating randomly between dark gray and light gray, several times each second. In light conditions, people see colors, with no flashing circles.
Flicker rate varies with activity. If you relax, flicker rate is 4 to 20 Hz. If flicker rate becomes more than 25 Hz, you cannot see flicker.
Flicker shows that sense qualities have elements.
causes
Variable-resolution size reflects sense-field dynamic building. Perhaps fewer receptors can respond at lower light levels. Perhaps intensity modulates a natural oscillation. Perhaps rods have competitive inhibition and excitation [Hardin, 1988] [Hurvich, 1981].
Observers can look {visual search} for objects, features, locations, or times {target, search} in scenes or lists.
distractors
Other objects {distractor, search} are not targets. Search time is directly proportional to number of targets and distractors {set size, search}.
types
Searches {conjunction search} can be for feature conjunctions, such as both color and orientation. Conjunction searches {serial self-terminating search} can look at items in sequence until finding target. Speed decreases with number of targets and distractors.
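A small sketch (illustrative constants only, Python) of the serial self-terminating prediction: reaction time grows linearly with set size, and the target-absent slope is about twice the target-present slope, because a present target is found on average halfway through the items.

    # Sketch: expected comparisons in a serial self-terminating search.
    # Target present: on average (n + 1) / 2 items are inspected; target absent: all n items.
    def expected_comparisons(set_size, target_present):
        return (set_size + 1) / 2 if target_present else set_size

    def predicted_rt(set_size, target_present, base_ms=400, per_item_ms=50):
        # base_ms and per_item_ms are hypothetical constants for illustration
        return base_ms + per_item_ms * expected_comparisons(set_size, target_present)

    for n in (4, 8, 16):
        print(n, predicted_rt(n, True), predicted_rt(n, False))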
Searches {feature search} can be for color, size, orientation, shadow, or motion. Feature searches are fastest, because mind searches objects in parallel.
Searches {spatial search} can be for feature conjunctions that have shapes or patterns, such as two features that cross. Mind performs spatial searches in parallel but can only search feature subsets {limited capacity parallel process}.
guided search theory
A parallel process {preattentive stage} suggests serial-search candidates {attentive stage} {guided search theory, search}.
Vision combines output from both eyes {binocular vision}|. Cats, primates, and predatory birds have binocular vision. Binocular vision allows stereoscopic depth perception, increases light reception, and detects differences between camouflage and surface. During cortex-development sensitive period, what people see determines input pathways to binocular cells and orientation cells [Blakemore and Greenfield, 1987] [Cumming and Parker, 1997] [Cumming and Parker, 1999] [Cumming and Parker, 2000].
One stimulus can affect both eyes, and effects can add {binocular summation}.
Visual-cortex cells {disparity detector} can combine right and left eye outputs to detect relative position disparities. Disparity detectors receive input from same-orientation orientation cells at different retinal locations. Higher binocular-vision cells detect distance directly from relative disparities, without form or shape perception.
People using both eyes do not know which eye {eye-of-origin} saw something [Blake and Cormack, 1979] [Kolb and Braun, 1995] [Ono and Barbieto, 1985] [Pickersgill, 1961] [Porac and Coren, 1986] [Smith, 1945] [Helmholtz, 1856] [Helmholtz, 1860] [Helmholtz, 1867] [Helmholtz, 1962].
Adaptation can transfer from one eye to the other {interocular transfer}.
Boundaries {contour, vision} have brightness differences and are the most-important visual perception. Contours belong to objects, not background.
curved axes
Curved surfaces have perpendicular curved long and short axes. In solid objects, short axis is object depth axis and indicates surface orientation. Curved surfaces have dark edge in middle, where light and dark sides meet.
completion
Mind extrapolates or interpolates contour segments to make object contours {completion, contour}.
When looking only at object-boundary part, even young children see complete figures. Children see completed outline, though they know it is not actually there.
crowding
If background contours surround figure, figure discrimination and recognition fail.
Two line segments can belong to same contour {relatability}.
Perception extends actual lines to make imaginary figure edges {subjective contour}|. Subjective contours affect depth perception.
Rods and cones {duplex vision} operate in different light conditions.
Vision has systems {photopic system} for daylight conditions.
Vision has systems {scotopic system} for dark or nighttime conditions.
Seeing at dusk {mesopic vision, dark} {twilight vision} is more difficult and dangerous.
Brain can find depth and distance {depth perception} {distance perception} in scenes, paintings, and photographs.
depth: closeness
Closer objects have higher edge contrast, more edge sharpness, position nearer scene bottom, larger size, overlap on top, and transparency. Higher edge contrast is most important. More edge sharpness is next most important. Position nearer scene bottom is more important for known eye-level. Transparency is least important. Nearer objects are redder.
depth: farness
Farther objects have smaller retinal size; are closer to horizon (if below horizon, they are higher than nearer objects); have lower contrast; are hazier, blurrier, and fuzzier with less texture details; and are bluer or greener. Nearer objects overlap farther objects and cast shadows on farther objects.
binocular depth cue: convergence
Focusing on near objects causes extraocular muscles to turn eyeballs toward each other, and kinesthesia sends this feedback to vision system. More tightening and stretching means nearer. Objects farther than ten meters cause no muscle tightening or stretching, so convergence information is useful only for distances less than ten meters.
binocular depth cue: shadow stereopsis
For far objects, with very small retinal disparity, shadows can still have perceptibly different angles {shadow stereopsis} [Puerta, 1989], so larger angle differences are nearer, and smaller differences are farther.
binocular depth cue: stereopsis
If eye visual fields overlap, the two scenes differ by a linear displacement, due to different sight-line angles. For a visual feature, displacement is the triangle base, which has angles at each end between the displacement line and sight-line, allowing triangulation to find distance. At farther distances, displacement is smaller and angle differences from 90 degrees are smaller, so distance information is imprecise.
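A minimal sketch of this triangulation, assuming a pinhole-camera model with hypothetical baseline and focal-length values (Python): depth is inversely proportional to the displacement (disparity), so small displacements at far distances make the estimate imprecise.

    # Sketch: depth from binocular displacement (disparity) with a pinhole model.
    # baseline_m: distance between the eyes; focal_m: eye focal length; both hypothetical.
    def depth_from_disparity(disparity_m, baseline_m=0.065, focal_m=0.017):
        if disparity_m <= 0:
            return float("inf")  # zero disparity: feature is effectively at infinity
        return baseline_m * focal_m / disparity_m

    print(depth_from_disparity(0.001))   # ~1.1 m
    print(depth_from_disparity(0.0001))  # ~11 m -- tenfold smaller disparity, tenfold farther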
binocular depth cue: inference
Inference includes objects at edges of retinal overlap in stereo views.
monocular depth cue: aerial perspective
Higher scene contrast means nearer, and lower contrast means farther. Bluer means farther, and redder means nearer.
monocular depth cue: accommodation
Focusing on near objects causes ciliary muscles to tighten to increase lens curvature, and kinesthesia sends this feedback to vision system. More tightening and stretching means nearer. Objects farther than two meters cause no muscle tightening or stretching, so accommodation information is useful only for distances less than two meters.
monocular depth cue: blur
More blur means farther, and less blur means nearer.
monocular depth cue: color saturation
Bluer objects are farther, and redder objects are nearer.
monocular depth cue: color temperature
Bluer objects are farther, and redder objects are nearer.
monocular depth cue: contrast
Higher scene contrast means nearer, and lower contrast means farther. Edge contrast, edge sharpness, overlap, and transparency depend on contrast.
monocular depth cue: familiarity
People can have previous experience with objects and their size, so larger retinal size is closer, and smaller retinal size is farther.
monocular depth cue: fuzziness
Fuzzier objects are farther, and clearer objects are nearer.
monocular depth cue: haziness
Hazier objects are farther, and clearer objects are nearer.
monocular depth cue: height above and below horizon
Objects closer to horizon are farther, and objects farther from horizon are nearer. If object is below horizon, higher objects are farther, and lower objects are nearer. If object is above horizon, lower objects are farther, and higher objects are nearer.
monocular depth cue: kinetic depth perception
Objects becoming larger are moving closer, and objects becoming smaller are moving away {kinetic depth perception}. Kinetic depth perception is the basis for judging time to collision.
monocular depth cue: lighting
Light and shade have contours. Light is typically above objects. Light typically falls on nearer objects.
monocular depth cue: motion parallax
While looking at an object, if observer moves, other objects moving backwards are nearer than the object, and other objects moving forwards are farther than the object. For the farther objects, objects moving faster are nearer, and objects moving slower are farther. For the nearer objects, objects moving faster are nearer, and objects moving slower are farther. Some birds use head bobbing to induce motion parallax. Squirrels move orthogonally to objects. While the observer moves looking straight ahead, objects moving backwards faster are closer, and objects moving backwards slower are farther.
monocular depth cue: occlusion
Objects that overlap other objects {interposition} are nearer, and objects behind other objects are farther {pictorial depth cue}. Objects with occluding contours are farther.
monocular depth cue: peripheral vision
At the visual periphery, parallel lines curve, like the effect of a fish eye lens, framing the visual field.
monocular depth cue: perspective
By linear perspective, parallel lines converge, so, for same object, smaller size means farther distance.
monocular depth cue: relative movement
To a stationary observer, if objects physically move at the same speed, objects moving slower across the retina are farther, and objects moving faster are nearer.
monocular depth cue: relative size
If two objects have the same shape and are judged to be the same, object with larger retinal size is closer.
monocular depth cue: retinal size
If observer has previous experience with object size, object retinal size allows calculating distance.
monocular depth cue: shading
Light and shade have contours. Shadows are typically below objects. Shade typically falls on farther objects.
monocular depth cue: texture gradient
Senses can detect gradients by difference ratios. Less fuzzy and larger surface-texture sizes and shapes are nearer, and more fuzzy and smaller are farther. Bluer and hazier surface texture is farther, and redder and less hazy surface texture is closer.
properties: precision
Depth-calculation accuracy and precision are low.
properties: rotation
Fixed object appears to revolve around eye if observer moves.
factors: darkness
In the dark, objects appear closer.
processes: learning
People learn depth perception and can lose depth-perception abilities.
processes: coordinates
Binocular depth perception requires only ground plane and eye point to establish coordinate system. Perhaps, sensations aid depth perception by building geometric images [Poggio and Poggio, 1984].
processes: two-and-one-half dimensions
ON-center-neuron, OFF-center-neuron, and orientation-column intensities build two-dimensional line arrays, then two-and-one-half-dimensional contour arrays, and then three-dimensional surfaces and texture arrays [Marr, 1982].
processes: three dimensions
Brain derives three-dimensional images from two-dimensional ones by assigning convexity and concavity to lines and vertices and making convexities and concavities consistent.
processes: triangulation model
Animals continually track distances and directions to distinctive landmarks.
Adjacent points not at edges are on same surface and so at same distance {continuity constraint, depth}.
Scenes land on right and left eye with same geometric shape, so feature distances and orientations are the same {corresponding retinal points}.
Brain stimuli {cyclopean stimulus} can result only from binocular disparity.
One eye can find object-size to distance ratio {distance ratio} {geometric depth}, using three object points. See Figure 1.
Eye fixates on object center point, edge point, and opposite-edge point. Assume object is perpendicular to sightline. Assume retina is planar. Assume that eye is spherical, rotates around center, and has calculable radius.
Light rays go from center point, edge point, and opposite edge point to retina. Using kinesthetic and touch systems and motor cortex, brain knows visual angles and retinal distances. Solving equations can find object-size to distance ratio.
When eye rotates, scenes do not change, except for focus. See Figures 2 and 3.
Calculating distances to space points
Vision cone receptors receive from a circular area of space that subtends one minute of arc (Figure 3). Vision neurons receive from a circular area of space that subtends one minute to one degree of arc.
To detect distance, neuron arrays receive from a circular area of space that subtends one degree of arc (Figure 4). For the same angle, circular surfaces at farther distances have longer diameters, bigger areas, and smaller circumference curvature.
Adjacent neuron arrays subtend the same visual angle and have retinal (and cortical) overlap (Figure 5). Retinal and cortical neuron-array overlap defines a constant length. Constant-length retinal-image size defines the subtended visual angle, which varies inversely with distance, allowing calculating distance (r = s / A) in one step.
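A small sketch of the one-step calculation just described, assuming a known constant length s and a measured visual angle A in radians (small-angle approximation, Python): r = s / A.

    import math

    # Sketch: distance from a known constant length and its subtended visual angle (r = s / A).
    def distance_from_angle(known_length_m, visual_angle_deg):
        angle_rad = math.radians(visual_angle_deg)
        return known_length_m / angle_rad  # small-angle approximation

    # Hypothetical example: a 1-degree array covering a 0.5 m feature implies ~28.6 m distance.
    print(distance_from_angle(0.5, 1.0))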
Each neuron array sends to a register for a unique spatial direction. The register calculates distance and finds color. Rather than use multiple registers at multiple locations, as in neural networks or holography, a single register can place a color at the calculated distance in the known direction. There is one register for each direction and distance. Registers are not physical neuron conglomerations but functional entities.
Both eyes can turn outward {divergence, eye}, away from each other, as objects get farther. If divergence is successful, there is no retinal disparity.
Brain expands more distant objects in proportion to the more contracted retinal-image size, making apparent size increase with increasing distance {size-constancy scaling} {Emmert's law} {Emmert law}. Brain determines size-constancy scaling by eye convergence, geometric perspective, texture gradients, and image sharpness. Texture gradients decrease in size with distance. Image sharpness decreases with distance.
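A brief sketch of size-constancy scaling (Emmert's law) under a small-angle approximation: perceived size is proportional to retinal-image size times perceived distance, so an afterimage of fixed retinal size appears larger when projected onto a farther surface. Values below are illustrative (Python).

    # Sketch: Emmert's law -- perceived size grows with perceived distance
    # for a fixed retinal-image size (e.g., an afterimage).
    def perceived_size(retinal_angle_rad, perceived_distance_m):
        return retinal_angle_rad * perceived_distance_m  # small-angle approximation

    retinal_angle = 0.01  # hypothetical fixed retinal image, about 0.57 degrees
    print(perceived_size(retinal_angle, 1.0))   # 0.01 m on a surface 1 m away
    print(perceived_size(retinal_angle, 10.0))  # 0.1 m on a surface 10 m away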
Two eyes can measure relative distance to scene point, using geometric triangulation {triangulation, eye}. See Figure 1.
comparison
Comparing triangulations from two different distances does not give more information. See Figure 2.
movement
Moving eye sideways while tracking scene point can calculate distance from eye to point, using triangulation. See Figure 3.
Moving eye sideways while tracking scene points calibrates distances, because other scene points travel across retina. See Figure 4.
Moving eye from looking at object edge to looking at object middle can determine scene-point distance. See Figure 5.
Moving eye from looking at object edge to looking at object other edge at same distance can determine scene-point distance. See Figure 6.
Scene features land on one retina point {uniqueness constraint, depth}, so brain stereopsis can match right-retina and left-retina scene points.
Various features {depth cue}| {cue, depth} signal distance. Depth cues are accommodation, colors, color saturation, contrast, fuzziness, gradients, haziness, distance below horizon, linear perspective, movement directions, occlusions, retinal disparities, shadows, size familiarity, and surface textures.
types
Non-metrical depth cues can show relative depth, such as object blocking other-object view. Metrical depth cues can show quantitative information about depth. Absolute metrical depth cues can show absolute distance by comparison, such as comparing to nose size. Relative metrical depth cues can show relative distance by comparison, such as twice as far away.
Vision has less resolution at far distances. Air has haze, smoke, and dust, which absorb redder light, so farther objects are bluer, have less light intensity, and have blurrier edges {aerial perspective}| than if air were transparent. (Air scatters blue more than red, but this effect is small except for kilometer distances.)
Brain perceives depth using scene points that stimulate right and left eyes differently {binocular depth cue} {binocular depth perception}. Eye convergences, retinal disparities, and surface-area sizes have differences.
surface area size
Brain can judge distance by overlap, total scene area, and area-change rate. Looking at surfaces, eyes see semicircles. See Figure 1. Front edge is semicircle diameter, and vision field above that line is semicircle half-circumference. For two eyes, semicircles overlap in middle. Closer surfaces make overlap less, and farther surfaces make overlap more. Total scene surface area is more for farther surfaces and less for closer surfaces. Movement changes perceived area at rate that depends on distance. Closer objects have faster rates, and farther objects have slower rates.
For fixation, both eyes turn toward each other {convergence, eye} {eye convergence} when objects are nearer than 10 meters. If convergence is successful, there is no retinal disparity. Greater eye convergence means object is closer, and lesser eye convergence means object is farther. See Figure 1.
Brain can judge surface relative distance by intensity change during movement toward and away from surface {intensity difference during movement}. See Figure 1.
moving closer
Moving from point to half that distance increases intensity four times, because eye gathers four times more light at closer radius.
moving away
Moving from point to double that distance decreases intensity four times, because eye gathers four times less light at farther radius.
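A minimal sketch of the inverse-square relation behind these judgments (Python, hypothetical source intensity): received intensity scales with 1 / distance squared, so halving the distance quadruples intensity and doubling it quarters intensity.

    # Sketch: inverse-square law for received light intensity.
    def received_intensity(source_intensity, distance_m):
        return source_intensity / (distance_m ** 2)

    source = 100.0  # hypothetical source intensity
    print(received_intensity(source, 2.0))  # 25.0
    print(received_intensity(source, 1.0))  # 100.0 -- half the distance, four times the intensity
    print(received_intensity(source, 4.0))  # 6.25  -- double the distance, one quarter the intensity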
moving sideways
Movement side to side and up and down changes intensity slightly by changing distance slightly. Perhaps, saccades and/or eyeball oscillations help determine distances.
memory
Experience with constant-intensity objects establishes distances.
accommodation
Looking at an object while moving it or the eye closer or farther causes lens-muscle tightening or loosening and makes a larger or smaller visual angle. If brain knows depth, movement toward and away can measure source intensity.
light ray
Scene points along same light ray project to same retina point. See Figure 2.
haze
Atmospheric haze affects light intensity. Haze attenuates intensity with distance. To a first approximation, an object twice as far away has about half the intensity, because its light encounters twice as many haze particles.
sound
Sound-intensity changes can find distances. Bats use sonar because it is too dark to see at night. Dolphins use sonar because water distorts light.
One eye can perceive depth {monocular depth cue}. Monocular depth cues are accommodation, aerial perspective, color, color saturation, edge, monocular movement parallax, occlusion, overlap, shadows, and surface texture.
Closer object can hide farther object {occlusion, cue}|. Perception knows many rules about occlusion.
Using both eyes can make depth and three dimensions appear {stereoscopic depth} {stereoscopy} {stereopsis}. Stereopsis aids random shape perception. Stereoscopic data analysis is independent of other visual analyses. Monocular depth cues can cancel stereoscopic depth. Stereoscopy does not allow highly unlikely depth reversals or unlikely depths.
Features farther away are smaller than when closer, so surfaces have larger texture nearby and smaller texture farther away {texture gradient}.
During fixations, eye is not still but drifts irregularly {drift, eye} {eye drift} through several minutes of arc, over several fovea cones.
During fixations, eye is not still but moves in straight lines {microsaccade} over 10 to 100 fovea cones.
Eyes scan scenes {scanning, vision} in regular patterns along outlines or contours, looking for angles and sharp curves, which give the most shape information.
During fixations, eye is not still but has tremor {eye tremor} {tremor, eye} over one or two fovea cones, as it also drifts.
After fixations lasting 120 ms to 130 ms, eye moves {saccade}|, in 100 ms, to a new fixation position.
brain
Superior colliculus controls involuntary saccades. Brain controls saccades using fixed vectors in retinotopic coordinates and using endpoint trajectories in head or body coordinates [Bridgeman et al., 1979] [Bridgeman et al., 1981] [Goodale et al., 1986].
movement
People do not have saccades while following moving objects or turning head while fixating objects.
transformation
When eye moves from one fixation to another, brain translates whole image up to 100 degrees of arc. World appears to stand still while eyes move, probably because motor signals to move eyes cancel perceptual retinal movement signals.
perception
Automatic saccades do not noticeably change scene [Akins, 1996] [Blackmore et al., 1995] [Dmytryk, 1984] [Grimes, 1996] [O'Regan et al., 1999] [Rensink et al., 1997] [Simons and Chabris, 1999] [Simons and Levin, 1997] [Simons and Levin, 1998] [Wilken, 2001].
Brain does not block input from eye to brain during saccades, but cortex suppresses vision during saccades {saccadic suppression}, so image blurs less. For example, people cannot see their eye movements in mirrors.
In land-vertebrate eyes, flexible lens focuses {accommodation, vision} image by changing surface curvature using eye ciliary muscles. In fish, an inflexible lens moves backwards and forwards, as in cameras. Vision can focus image on fovea, by making thinnest contour line and highest image-edge gradient [Macphail, 1999].
process
To accommodate, lens muscles start relaxed, with no accommodation. Brain tightens lens muscles and stops at highest spatial-frequency response.
distance
Far objects require no eye focusing. Objects within four feet require eye focusing to reduce blur. Brain can judge distance by muscle tension, so one eye can measure distance. See Figure 1.
Pinhole camera can focus scene, but eye is not pinhole camera. See Figure 2.
far focus
If accommodation is for point beyond object, magnification is too low, edges are blurry, and spatial-frequency response is lower, because scene-point light rays land on different retina locations, before they meet at focal point. Focal point is past retina.
near focus
If accommodation is for point nearer than object, magnification is too high, edges are blurry, and spatial-frequency response is lower, because scene-point light rays meet at focal point and then land on different retina locations. Focal point is in eye middle.
Right and left retinas see different images {retinal disparity} {binocular disparity}| [Dacey et al., 2003] [DeVries and Baylor, 1997] [Kaplan, 1991] [Leventhal, 1991] [MacNeil and Masland, 1998] [Masland, 2001] [Polyak, 1941] [Ramón y Cajal, 1991] [Rodieck et al., 1985] [Rodieck, 1998] [Zrenner, 1983].
correlation
Brain can correlate retinal images to pair scene retinal points and then find distances and angles.
fixation
Assume eye fixates on a point straight-ahead. Light ray from scene point forms horizontal azimuthal angle and vertical elevation angle with straight-ahead direction. With no eye convergence, eye azimuthal and elevation angles from scene point differ {absolute disparity}. Different scene points have different absolute disparities {relative disparity}.
When both eyes fixate on same scene point, eye convergence places scene point on both eye foveas at corresponding retinal points, azimuthal and elevation angles are the same, and absolute disparity is zero. See Figure 1. After scene-point fixation, azimuth and elevation angles differ for all other scene points. Brain uses scene-point absolute-disparity differences to find relative disparities to estimate relative depth.
horopter
Points from horopter land on both retinas with same azimuthal and elevation angles and same absolute disparities. These scene points have no relative disparity and so have single vision. Points not close to horopter have different absolute disparities, have relative disparity, and so have double vision. See Figure 2.
location
With eye fixation on far point between eyes and with eye convergence, if scene point is straight-ahead, between eyes, and nearer than fixation distance, point lands outside fovea, for both eyes. See Figure 3. For object closer than fixation plane, focal point is after retina {crossed disparity}.
With eye fixation on close point between eyes and eye convergence, if scene point is straight-ahead, between eyes, and farther than fixation distance, point lands inside fovea, for both eyes. For object farther than fixation plane, focal point is before retina {uncrossed disparity}.
Two eyes can measure relative distance to point by retinal disparity. See Figure 4.
motion
Retinal disparity and motion change are equivalent perceptual problems, so finding distance from retinal disparity and finding lengths and shape from motion changes use similar techniques.
Eye focuses at a distance, through which passes a vertical plane {fixation plane} {plane of fixation}, perpendicular to sightline. From that plane's points, eye convergence can make right and left eye images almost correspond, with almost no disparity. From points in a circle {Vieth-Müller circle} in that plane, eye convergence can make right and left eye images have zero disparity.
After eye fixation on scene point and eye convergence, an imaginary sphere {horopter} passes through both eye lenses and fixation point. Points from horopter land on both retinas with same azimuthal and elevation angles and same absolute disparities. These scene points have no relative disparity and so have single vision.
Brain fuses scene features that lie within a small distance from the horopter {Panum's fusion area} {Panum fusion area} {Panum's fusional area} into one feature. Brain does not fuse scene features outside Panum's fusional area, but features still register in both eyes, so the feature appears double.
Color varies in energy flow per unit area {intensity, vision}. Vision can detect very low intensity. People can see over ten-thousand-fold light intensity range. Vision is painful at high intensity.
sensitivity
People can perceive one-percent intensity differences. Sensitivity improves in dim light when using both eyes.
receptors
Not stimulating long-wavelength or middle-wavelength receptor reduces brightness. For example, extreme violets are less bright than other colors.
temporal integration
If light has constant intensity for less than 100 ms, brain perceives it as becoming less bright. If light has constant intensity for 100 ms to 300 ms, brain perceives it as becoming brighter. If light has constant intensity for longer than 300 ms, brain perceives it as maintaining same brightness.
unchanging image
After people view unchanging images for two or three seconds, image fades and becomes dark gray or black. If object contains sharp boundaries between highly contrasting areas, object reappears intermittently.
bleaching
Eyes blinded by bright light recover in 30 minutes, as eye chemicals become unbleached.
If stimulus lasts less than 0.1 second, brightness is product of intensity and duration {Bloch's law} {Bloch law}.
Phenomenal brightness {brightness} {luminosity} relates to logarithm of total stimulus-intensity energy flux from all wavelengths. Surfaces that emit more lumens are brighter. On Munsell scale, brightness increases by 1.5 units if lumens double.
properties: reflectance
Surfaces that reflect different spectra but emit same number of lumens are equally bright.
properties: reflectivity
For spectral colors, brightness is logarithmic, not linear, with reflectivity.
factors: adaptation
Brightness depends on eye adaptation state. Parallel pathways calculate brightness. One pathway adapts to constant-intensity stimuli, and the other does not adapt. If two same-intensity flashes start at the same time, the briefer flash looks dimmer than the longer flash. If two same-intensity flashes end at the same time, the briefer flash looks brighter than the longer flash {temporal context effect} (Sejnowski). Visual system uses visual-stimulus timing and spatial context to calculate brightness.
factors: ambient light
Brightness is relative and depends on ambient light.
factors: color
Light colors change less, and dark colors change more, as source brightness increases. Light colors change less, and dark colors change more, as color saturation decreases.
factors: mental state
Brightness depends on mental state.
brightness control
Good brightness control increases all intensities by the same amount. Consciousness cannot control brightness directly. Television Brightness control sets "picture" level by multiplying the input signal by a factor {gain, brightness}. If gain is too low, high-input signals have low intensity, and many low-input signals become the same black. If gain is too high, low-input signals have high intensity, and many high-input signals become the same white. Television Brightness control increases the ratio between black and white and so really changes contrast.
Detected light has difference between lowest and highest intensity {contrast, vision}.
contrast control
Good contrast control sets black to zero intensity while decreasing or increasing maximum intensity. Consciousness cannot control contrast directly. Television Contrast control sets the "black level" by shifting the lowest intensity, which shifts the whole intensity scale. It adjusts the input signal so that the lowest input gives zero intensity. If the setting is too low, lower input signals all result in zero intensity. If the setting is too high, the lowest input signal results in greater than zero intensity. Television Contrast control changes all intensities by the same amount and so really changes brightness.
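A small sketch of the distinction drawn above, using the common video model output = gain x input + offset (Python, hypothetical clipping limits): an offset shifts all intensities by the same amount, while a gain stretches the range between black and white; too-low or too-high settings crush signals into the same black or the same white.

    # Sketch: offset shifts all intensities; gain stretches the black-to-white range.
    def display_intensity(signal, gain=1.0, offset=0.0, max_out=255):
        out = gain * signal + offset
        return max(0, min(max_out, out))  # clipping crushes out-of-range signals

    signals = [0, 64, 128, 255]
    print([display_intensity(s, gain=1.0, offset=30) for s in signals])  # offset: shift all values
    print([display_intensity(s, gain=1.5, offset=0) for s in signals])   # gain: stretch the range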
Mind can detect small intensity difference {contrast threshold} between light and dark surface area.
Larger objects have smaller contrast thresholds. Stimulus-size spatial frequency determines contrast-threshold reciprocal {contrast sensitivity function} (CSF). Contrast-threshold reciprocal is large when contrast threshold is small.
Visual system increases brightness contrast across edge {edge enhancement}, making lighter side lighter and darker side darker.
If eyes are still with no blinking, scene fades {fading} [Coppola and Purves, 1996] [Pritchard et al., 1960] [Tulunay-Keesey, 1982].
Human visual systems increase brightness contrast across edges, making lighter side lighter and darker side darker {Mach band}.
Luminous flux leaving, arriving, or transmitted in a direction divided by surface area {luminance}. Luminance is a constant times the sum over frequencies of spectral radiant energy weighted by the long-wavelength-cone and middle-wavelength-cone spectral-sensitivity functions [Autrum, 1979] [Segall et al., 1966]. Luminance relates to brightness. Lateral-geniculate-nucleus magnocellular-cell layers {luminance channel, LGN} measure luminance. Light power (radiance) and energy differ at different frequencies {spectral power distribution}, typically measured in 31 ranges 10 nm wide between 400 nm and 700 nm.
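A brief sketch of the weighted sum just described (Python): luminance is a constant times the sum, over wavelength bands, of spectral radiance weighted by a luminous-efficiency function. The band values and weights below are hypothetical placeholders, not the standard tables.

    # Sketch: luminance as a weighted sum over 10 nm spectral bands (400-700 nm).
    def luminance(spectral_radiance, luminous_efficiency, k=683.0):
        # one value per band in each list; k = 683 lm/W is the standard photopic constant
        return k * sum(r * v for r, v in zip(spectral_radiance, luminous_efficiency))

    bands = 31                      # 400-700 nm in 10 nm steps
    radiance = [0.01] * bands       # hypothetical flat spectrum
    efficiency = [0.0] * bands
    efficiency[15] = 1.0            # hypothetical peak near 555 nm; the real curve is smooth
    print(luminance(radiance, efficiency))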
Light {luminous flux} can shine with a spectrum of wavelengths.
Light sources {illuminant} shine light on observed surfaces.
Light {radiant flux} can emit or reflect with a spectrum of wavelengths.
Radiant flux in a direction divided by surface area {radiance}.
Radiant flux divided by surface area {irradiance}.
Brain can perceive motion {motion perception} {motion detector}. Motion analysis is independent of other visual analyses.
properties: adaptation
Motion detector neurons adapt quickly.
properties: direction
Most cortical motion-detector neurons detect motion direction.
properties: distance
Most cortical motion-detector neurons are for specific distance.
properties: fatigue
Motion-detector neurons can fatigue.
properties: location
Most cortical motion-detector neurons are for specific space direction.
properties: object size
Most cortical motion-detector neurons are for specific object spot or line size. To detect larger or smaller objects, motion-detector neurons have larger or smaller receptive fields.
properties: rotation
To have right and left requires asymmetry, such as a dot or shape. For a symmetric rotating object, one side appears to go backward while the other goes forward, so the whole appears to stand still.
properties: speed
Most cortical motion-detector neurons detect motion speed.
processes: brain
Area-V5 neurons detect different speed motions in different directions at different distances and locations for different object spot or line sizes. Motion detectors are for one direction, object size, distance, and speed relative to background. Other neurons detect expansion, contraction, and right or left rotation [Thier et al., 1999].
processes: frame
Spot motion from one place to another is like an appearance at one location followed by an appearance at another location. Spot must excite the motion-detector neuron for that direction and distance.
processes: opposite motions
Motion detectors interact, so motion inhibits opposed motion, making motion contrasts. For example, motion in one direction excites motion detectors for that direction and inhibits motion detectors for opposite direction.
processes: retina image speed
Retinal radial-image speed relates to object distance.
processes: timing
Motion-detector-neuron comparison is not simultaneous addition but has delay or hold from first neuron to wait for second excitation. Delay can be long, with many intermediate neurons, far-apart neurons, or slow motion, or short, with one intermediate neuron, close neurons, or fast motion.
processes: trajectory
Motion detectors work together to detect trajectory or measure distances, velocities, and accelerations. Higher-level neurons connect motion detection units to detect straight and curved motions (Werner Reichardt). As motion follows trajectory, memory shifts to predict future motions.
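A minimal sketch of a delay-and-correlate (Reichardt-style) motion detector under these assumptions (Python, hypothetical signals): two receptors report brightness over time; each branch multiplies one receptor's delayed signal by the other's current signal; subtracting the branches gives a signed response whose sign indicates motion direction.

    # Sketch: delay-and-correlate (Reichardt-style) motion detector for two receptors.
    def reichardt_response(left_signal, right_signal, delay=1):
        response = 0.0
        for t in range(delay, len(left_signal)):
            # left-then-right motion excites the first term; right-then-left excites the second
            response += left_signal[t - delay] * right_signal[t]
            response -= right_signal[t - delay] * left_signal[t]
        return response  # positive: left-to-right motion; negative: the reverse

    # Hypothetical spot moving left to right: it hits the left receptor one step before the right.
    left = [0, 1, 0, 0]
    right = [0, 0, 1, 0]
    print(reichardt_response(left, right))   # positive
    print(reichardt_response(right, left))   # negative (opposite direction)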
Animal species have movement patterns {biological motion}. Distinctive motion patterns, such as falling leaf, pouncing cat, and swooping bat, allow object recognition and future position prediction.
Vision can detect that surface is approaching eye {looming response}. Looming response helps control flying and mating.
For moving objects, eyes keep object on fovea, then fall behind, then jump to put object back on fovea {smooth pursuit}. Smooth pursuit is automatic. People cannot voluntarily use smooth pursuit. Smooth pursuit happens even if people have no sensations of moving objects [Thiele et al., 2002].
Three-month-old infants understand {Theory of Body} that when moving objects hit other objects, other objects move. Later, infants understand {Theory of Mind Mechanism} self-propelled motion and goals. Later, infants understand {Theory of Mind Mechanism-2} how mental states relate to behaviors. Primates can understand that acting on objects moves contacted objects.
Head or body movement causes scene retinal displacement. Nearer objects displace more, and farther objects displace less {motion parallax}| {movement parallax}. If eye moves to right while looking straight-ahead, objects appear to move to left. See Figure 1.
Nearer objects move greater visual angle. Farther objects move smaller visual angle and appear almost stationary. See Figure 2.
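A small sketch of why nearer objects sweep through larger visual angles (Python, hypothetical values): for sideways observer motion at speed v, a stationary object at distance d moves across the retina at angular speed of roughly v / d radians per second.

    # Sketch: motion parallax -- angular speed of a stationary object for a sideways-moving observer.
    def angular_speed_rad_per_s(observer_speed_m_s, object_distance_m):
        return observer_speed_m_s / object_distance_m  # approximation for objects off to the side

    v = 1.5  # hypothetical walking speed, m/s
    print(angular_speed_rad_per_s(v, 2.0))    # 0.75 rad/s -- near object sweeps quickly
    print(angular_speed_rad_per_s(v, 100.0))  # 0.015 rad/s -- far object appears almost stationary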
movement sequence
Object sequence can change with movement. See Figure 3.
depth
Brain can use geometric information about two different positions at different times to calculate relative object depth. Brain can also use geometric information about two different positions at same time, using both eyes.
While observer is moving, nearer objects seem to move backwards while farther ones move in same direction as observer {monocular movement parallax}.
When viewing moving object through small opening, motion direction can be ambiguous {aperture problem}, because moving spot or two on-off spots can trigger motion detectors. Are both spots in window aperture same object? Motion detectors solve the problem by finding shortest-distance motion.
When people see objects, first at one location, then very short time later at another location, and do not see object anywhere between locations, first object seems to move smoothly to where second object appears {apparent motion}|.
Moving spot triggers motion detectors for two locations.
two locations and spot
How does brain associate two locations with one spot {correspondence problem, motion}? Brain follows spot from one location to next unambiguously. Tracking moving objects requires remembering earlier features and matching with current features. Vision can try all possible matches and, through successive iterations, find matches that yield minimum total distance between presentations.
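A brief sketch of the minimum-total-distance idea just described, assuming two small sets of spot positions from successive presentations (Python): try every pairing and keep the one with the smallest summed displacement. Brute force over permutations is fine for a handful of spots and is for illustration only.

    from itertools import permutations
    import math

    # Sketch: match spots across two presentations by minimizing total displacement.
    def best_matching(frame1, frame2):
        def dist(a, b):
            return math.hypot(a[0] - b[0], a[1] - b[1])
        best, best_cost = None, float("inf")
        for perm in permutations(range(len(frame2))):
            cost = sum(dist(frame1[i], frame2[j]) for i, j in enumerate(perm))
            if cost < best_cost:
                best, best_cost = perm, cost
        return best, best_cost

    # Hypothetical spot positions (x, y) in two successive presentations.
    frame1 = [(0, 0), (5, 0)]
    frame2 = [(5.5, 0.2), (0.4, 0.1)]
    print(best_matching(frame1, frame2))  # matches each spot to its nearby counterpart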
location and spot
Turning one spot on and off can trigger same motion detector. How does brain associate detector activation at different times with one spot? Brain assumes same location is same object.
processes: three-dimensional space
Motion detectors are for specific locations, distances, object sizes, speeds, and directions. Motion-detector array represents three-dimensional space. Space points have spot-size motion detectors.
processes: speed
Brain action pathway is faster than object-recognition pathway. Brain calculates eye movements faster than voluntary movements.
constraints: continuity constraint
Adjacent points not at edges are at same distance from eye {continuity constraint, vision}.
constraints: uniqueness constraint
Scene features land on one retinal location {uniqueness constraint, vision}.
constraints: spatial frequency
Scene features have different left-retina and right-retina positions. Retina can use low resolution, with low spatial frequency, to analyze big regions and then use higher and higher resolutions.
If an image or light spot appears on a screen and then a second image appears 0.06 seconds later at a randomly different location, people perceive motion from first location to second location {phi phenomenon}. If an image or light spot blinks on and off slowly and then a second image appears at a different location, people see motion. If a green spot blinks on and off slowly and then a red spot appears at a different location, people see motion, and dot appears to change color halfway between locations.
Objects {luminance-defined object}, for example bright spots, can contrast in brightness with background. People see luminance-defined objects move by mechanism that differs from texture-defined object-movement mechanism. Luminance-defined objects have defined edges.
Objects {texture-defined object} {contrast-defined object} can contrast in texture with background. People see texture-defined objects move by a mechanism that differs from the luminance-defined object-movement mechanism. Contrast changes in patterned ways, with no defined edges.
Luminance changes indicate motion {first-order motion}.
Contrast and texture changes indicate motion {second-order motion}.
Incoming visual information is continuous flow {visual flow}| {optical flow, vision} {optic flow} that brain can analyze for constancies, gradients, motion, and static properties. As head or body moves, the observer moves through a stationary environment. Optical flow reveals whether one is in motion or not. Optical flow reveals planar surfaces. Optical flow is texture movement across the eye as animals move.
Optic flow has a point {focus of expansion} (FOE) {expansion focus} where horizon meets motion-direction line. All visual features seem to come out of this straight-ahead point as observer moves closer, making radial movement pattern {radial expansion} [Gibson, 1966] [Gibson, 1979].
Optic flow has information {tau, optic flow} that signals how long until something hits people {time to collision} (TTC) {collision time}. Tau is ratio between retinal-image size and retinal-image-size expansion rate. Tau is directly proportional to time to collision.
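A minimal sketch of tau under these assumptions (Python, hypothetical values): retinal-image size is measured at two nearby moments; tau is the current size divided by its rate of expansion, which approximates time to collision at constant approach speed.

    # Sketch: tau -- retinal-image size divided by its expansion rate approximates time to collision.
    def tau(image_size_now, image_size_earlier, dt_s):
        expansion_rate = (image_size_now - image_size_earlier) / dt_s
        return image_size_now / expansion_rate

    # Hypothetical retinal-image sizes (arbitrary units) 0.1 s apart for an approaching object.
    print(tau(image_size_now=0.105, image_size_earlier=0.100, dt_s=0.1))  # ~2.1 s to collision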
Mammals can throw and catch {Throwing and Catching}.
Animal Motions
Animals can move in direction, change direction, turn around, and wiggle. Animals can move faster or slower. Animals move over horizontal ground, climb up and down, jump up and down, swim, dive, and fly.
Predators and Prey
Predators typically intercept moving prey, trying to minimize separation. In reptiles, optic tectum controls visual-orientation movements used in prey-catching behaviors. Prey typically runs away from predators, trying to maximize separation. Animals must account for accelerations and decelerations.
Gravity and Motions
Animals must account for gravity as they move and catch. Some hawks free-fall straight down to surprise prey. Seals can catch thrown balls and can throw balls to targets. Dogs can catch thrown balls and floating frisbees. Cats raise themselves on hind legs to trap or bat thrown-or-bouncing balls with front paws.
Mammal Brain
Reticular formation, hippocampus, and neocortex are only in mammals. Mammal superior colliculus can integrate multisensory information at same spatial location [O'Regan and Noë, 2001]. In mammals, dorsal vision pathway indicates object locations, tracks unconscious motor activity, and guides conscious actions [Bridgeman et al., 1979] [Rossetti and Pisella, 2002] [Ungerleider and Mishkin, 1982] [Yabuta et al., 2001] [Yamagishi et al., 2001].
Allocentric Space
Mammal dorsal visual system converts spatial properties from retinotopic coordinates to spatiotopic coordinates. Using stationary three-dimensional space as fixed reference frame simplifies trajectories perceptual variables. Most motions are two-dimensional rather than three-dimensional. Fixed reference frame separates gravity effects from internally generated motions. Internally generated motion effects are straight-line motions, rather than curved motions.
Human Throwing and Shooting
Only primates can throw, because they can stand upright and have suitable arms and hands. From 45,000 to 35,000 years ago, Homo sapiens and Neanderthal Middle-Paleolithic hunter-gatherers cut and used wooden spears. From 15,000 years ago, Homo sapiens Upper Paleolithic hunter-gatherers cut and used wooden arrows, bows, and spear-throwers. Human hunter-gatherers threw and shot over long trajectories.
Human Catching
Geometric Invariants: Humans can catch objects traveling over long trajectories. Dogs and humans use invariant geometric properties to intercept moving objects.
Trajectory Prediction: To catch baseballs, eyes follow ball while people move toward position where hand can reach ball. In the trajectory prediction strategy [Saxberg, 1987], fielder perceives ball initial direction, velocity, and perhaps acceleration, then computes trajectory and moves straight to where hand can reach ball.
Acceleration Cancellation: When catching ball coming towards him or her, fielder must run under ball so ball appears to move upward at constant speed. In the optical-acceleration-cancellation hypothesis [Chapman, 1968], fielder motion toward or away from ball cancels ball perceived vertical acceleration, making constant upward speed. If ball appears to vertically accelerate, it lands farther than fielder. If it appears to vertically decelerate, it lands shorter. Ball rises until caught, because baseball is always above horizon, far objects are near horizon, and near objects are high above horizon.
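A small sketch of the optical-acceleration-cancellation test, assuming the fielder samples the ball's optical elevation angle at regular intervals (Python, hypothetical angles): if the tangent of the elevation angle rises at a constant rate (near-zero second difference), the fielder is at the landing spot; an accelerating tangent means the ball lands beyond, a decelerating tangent means it lands short.

    import math

    # Sketch: optical acceleration cancellation -- does tan(elevation) rise at a constant rate?
    def optical_acceleration(elevation_deg_samples, dt_s=0.5):
        tans = [math.tan(math.radians(a)) for a in elevation_deg_samples]
        rates = [(b - a) / dt_s for a, b in zip(tans, tans[1:])]
        # near-zero change of rate: fielder is at the landing point
        return [(r2 - r1) / dt_s for r1, r2 in zip(rates, rates[1:])]

    print(optical_acceleration([10, 20, 30]))  # positive: tangent accelerating, ball lands beyond
    print(optical_acceleration([10, 19, 26]))  # negative: tangent decelerating, ball lands short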
Transverse Motion: Fielder controls transverse motion independently of radial motion. When catching ball toward right or left, fielder moves transversely to ball path, holding ball-direction and fielder-direction angle constant.
Linear Trajectory: In linear optical trajectory [McBeath et al., 1995], when catching ball to left or right, fielder runs in a curve toward ball, so ball rises in optical height, not to right or left. Catchable balls appear to go straight. Short balls appear to curve downward. Long balls appear to curve upward. Ratio between ball elevation and azimuth angles stays constant. Fielder coordinates transverse and radial motions. Linear optical trajectory is similar to simple predator-tracking perceptions. Dogs use the linear optical trajectory method to catch frisbees [Shaffer et al., 2004].
Optical Acceleration: Plotting optical-angle tangent changes over time, fielders appear to use optical-acceleration information to catch balls [McLeod et al., 2001]. However, optical trajectories mix fielder motions and ball motions.
Perceptual Invariants: Optical-trajectory features can be invariant with respect to fielder motions. Fielders catch fly balls by controlling ball-trajectory perceptions, such as lateral displacement, rather than by choosing how to move [Marken, 2005].
Brain can count {number perception}. Number perception can relate to time-interval measurement, because both measure number of units [Dehaene, 1997].
Number perception can add energy units to make sum {accumulator model} [Dehaene, 1997].
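A brief sketch of the accumulator idea as stated (Python, hypothetical noise level): each counted item adds one noisy energy increment, and the noisy sum estimates the number, with variability growing with set size.

    import random

    # Sketch: accumulator model -- each item adds a noisy unit of "energy"; the sum estimates number.
    def accumulate(n_items, noise_sd=0.2, seed=None):
        rng = random.Random(seed)
        total = sum(rng.gauss(1.0, noise_sd) for _ in range(n_items))
        return total  # read-out compares this noisy magnitude against remembered magnitudes

    print(round(accumulate(5, seed=1), 2))   # near 5, but noisy
    print(round(accumulate(20, seed=1), 2))  # near 20, with proportionally larger spread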
Number perception can associate objects with ordered-symbol list {numeron list model} [Dehaene, 1997].
Number perception can use mental images in arrays, so objects are separate {object file model} [Dehaene, 1997].
Vision detects smallest visual angle {visual acuity} {acuity, vision}.
If they sample too few lines {undersampling}, people perceive gratings at an incorrect spatial frequency {aliasing}.
Visual angles land on retinal areas, which send to larger visual-cortex surface areas {cortical magnification}.
Normal vision means that people can see at 20 feet what standard-vision people can detect at 20 feet {twenty-twenty}. In contrast, 20-40 means that people can see at 20 feet only what standard-vision people can detect at 40 feet.
Scene features have diameter, whose ends define rays that go to eye-lens center to form angle {visual angle}.
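A minimal sketch of the visual-angle computation this definition implies (Python): the angle subtended at the lens center by a feature of given diameter at given distance.

    import math

    # Sketch: visual angle subtended by a feature of given diameter at given distance.
    def visual_angle_deg(diameter_m, distance_m):
        return math.degrees(2 * math.atan(diameter_m / (2 * distance_m)))

    print(visual_angle_deg(0.31, 1.0))          # ~17.7 degrees for a 31 cm object at 1 m
    print(visual_angle_deg(3475e3, 384400e3))   # ~0.52 degrees (the Moon, a familiar benchmark)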
Visual perceptual processes can detect local surface properties {surface texture} {texture perception} [Rogers and Collett, 1989] [Yin et al., 1997].
surface texture
Surface textures are point and line patterns, with densities, locations, orientations, and gradients. Surface textures have point and line spatial frequencies [Bergen and Adelson, 1988] [Bülthoff et al., 2002] [Julesz, 1981] [Julesz, 1987] [Julesz and Schumer, 1981] [Lederman et al., 1986] [Malik and Perona, 1990].
occipital lobe
Occipital-lobe complex and hypercomplex cells detect points, lines, surfaces, line orientations, densities, and gradients and send to neuron assemblies that detect point and line spatial frequencies [DeValois and DeValois, 1988] [Hubel and Wiesel, 1959] [Hubel and Wiesel, 1962] [Hubel, 1988] [Livingstone, 1998] [Spillman and Werner, 1990] [Wandell, 1995] [Wilson et al., 1990].
similar statistics
Similar surface textures have similar point and line spatial frequencies and first-order and second-order statistics [Julesz and Miller, 1962].
gradients
Texture gradients are proportional to surface slant, surface tilt, object size, object motion, shape constancy, surface smoothness, and reflectance.
gradients: object
Constant texture gradient indicates one object. Similar texture patterns indicate same surface region.
gradients: texture segmentation
Brain can use texture differences to separate surface regions.
speed
Brain detects many targets rapidly and simultaneously to select and warn about approaching objects. Brain can detect textural changes in less than 150 milliseconds, before attention begins.
machine
Surface-texture detection can use point and line features, such as corner detection, scale-invariant features (SIFT), and speeded-up robust features (SURF) [Wolfe and Bennett, 1997]. For example, in computer vision, the Gradient Location-Orientation Histogram (GLOH) SIFT descriptor uses radial grid locations and gradient angles, then finds principal components, to distinguish surface textures [Mikolajczyk and Schmid, 2005].
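A small sketch of the general idea behind gradient-based descriptors such as SIFT and GLOH, not an implementation of either (Python, hypothetical patch): compute local gradients over a small grayscale patch and histogram their orientations, so similar textures give similar histograms.

    import math

    # Sketch: toy gradient-orientation histogram over a small grayscale patch,
    # illustrating the general idea behind SIFT/GLOH-style texture descriptors.
    def orientation_histogram(patch, bins=8):
        hist = [0.0] * bins
        for y in range(1, len(patch) - 1):
            for x in range(1, len(patch[0]) - 1):
                dx = patch[y][x + 1] - patch[y][x - 1]
                dy = patch[y + 1][x] - patch[y - 1][x]
                magnitude = math.hypot(dx, dy)
                angle = math.atan2(dy, dx) % (2 * math.pi)
                hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
        total = sum(hist) or 1.0
        return [h / total for h in hist]  # normalized so overall brightness does not matter

    # Hypothetical 4x4 patch with a vertical edge: gradients point horizontally.
    patch = [[0, 0, 9, 9]] * 4
    print(orientation_histogram(patch))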
Surfaces have small regular repeating units {texel}.
Texture perception uses three local-feature types {texton}: elongated blobs {line segment, texton}, blob ends {end-point}, and blob crossings {texture, texton}. Visual-cortex simple and complex cells detect elongated blobs, terminators, and crossings.
search
Texture perception searches in parallel for texton type and density changes.
attention
Texture discrimination precedes attention.
For texton changes, brain calls attention processes.
similarity
If elongated blobs are the same and blob terminators total the same number, textures appear the same.
statistics
Brain uses first-order texton statistics, such as texton type changes and density gradients, in texture perception.
Retina reference frame and object reference frame must match {viewpoint consistency constraint}.
Visual features can stay the same when observation point changes {viewpoint-invariance, vision}. Brain stores such features for visual recognition.
People have a reference point {visual egocenter} {egocenter, vision} on line passing through nosebridge and head center, for specifying locations and directions.
Brain first processes basic features {early vision}, then prepares to recognize objects and understand scenes, then recognizes objects and understands scenes.
Brain first processes basic features, then prepares to recognize objects and understand scenes {middle vision} {midlevel vision}, then recognizes objects and understands scenes.
Brain first processes basic features, then prepares to recognize objects and understand scenes, then recognizes objects and understands scenes {high-level vision}.