First, as clued by the title and the word “perspective” in the flavor, this scene must be reimagined from “above”. Reconstructing the scene allows each photograph to be assigned to its photographer, all of whom appear in the photos. Though some people’s clothes differ in various photos, their poses and positions in space do not change. From above, and with photographers assigned arbitrary letters, the layout is as follows:
As clued by the ubiquity of dots and stripes (and the fact that some of the people change clothes), patterns matter. Every photograph shows either two people, one dressed in dots and one dressed in stripes, or a single person, who wears both dots and stripes, and what someone wears can depend on who is looking at them. This information can be represented as a graph. For reasons that will become apparent shortly, mark people holding checkered flags with double circles and the person holding a green flag with an extra arrow:
Given that green and checkered flags are symbols of start and finish, one might imagine walking this graph, starting at the green flag and trying to end at a checkered flag, recording the pattern encountered at each transition. This is, in fact, a deterministic finite automaton (as clued by the flavor’s reference to “cogs in a machine” and confirmed by the title: both it and the team name it alludes to abbreviate to DFA). Analysis or experimentation reveals that it accepts only the following sequences, continuing to use ○ to represent dots and | to represent stripes:
If these sequences are read as binary numbers (most significant bit first, as indicated by the permission of leading zeros), they correspond to 1, 4, 9, 12, 15, 18, 20, and 22, or equivalently, as positions in the alphabet, to the letters A, D, I, L, O, R, T, and V.
Preserving only those letters of the given phrase CHECKSUM VALUE IN FEEDBACK NETWORK leaves VALIDATOR, a reasonable response to the bizarre clue, a description of the mechanic, and the answer to the puzzle.
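The final extraction can be checked mechanically. The following is a minimal sketch, not part of the puzzle itself: it assumes the eight values are alphabet positions with A=1, and the helper names (`msb_first_to_int`, `value_to_letter`, `keep_letters`) are made up for illustration. Since the mapping of dots and stripes to bits is not restated here, the binary helper works on plain `0`/`1` strings.

```python
# Values read off the accepted sequences, as listed in the write-up.
VALUES = [1, 4, 9, 12, 15, 18, 20, 22]
PHRASE = "CHECKSUM VALUE IN FEEDBACK NETWORK"


def msb_first_to_int(bits):
    """Interpret a string of '0'/'1' characters, most significant bit first."""
    return int(bits, 2)


def value_to_letter(n):
    """Map 1..26 to A..Z (assumption: A=1 alphabet positions)."""
    return chr(ord("A") + n - 1)


def keep_letters(phrase, allowed):
    """Keep only the characters of phrase that are in allowed, in order."""
    return "".join(ch for ch in phrase if ch in allowed)


letters = [value_to_letter(n) for n in VALUES]
answer = keep_letters(PHRASE, set(letters))

print(letters)  # ['A', 'D', 'I', 'L', 'O', 'R', 'T', 'V']
print(answer)   # VALIDATOR
```

Leading zeros do not change the value under the most-significant-bit-first reading, so `msb_first_to_int("00001")` is 1 and `msb_first_to_int("10110")` is 22.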