A prompt does more than request information. Its category words, place names, and implied task can quietly determine which version of a business becomes easiest for the system to find.
In a composite scenario, “Find a wellness clinic in Bangkok” returned a small independent provider. “Find a private hospital in Bangkok offering the same treatments” returned the same English name, now attached to a larger medical identity. A third prompt asked for “the best place for treatment near the district” and produced a recommendation that combined the clinic’s address with the larger facility’s category.
The scenario is built around Study object A. The clinic’s Thai name has several English transliterations and resembles the name of a larger medical facility. Its website describes treatments, while map listings and directories apply broader categories unevenly. The underlying case remained the same, but the prompt wording changed and the apparent route through the visible material changed with it.
A few words can redraw the search field
Prompt comparisons are often described as though researchers merely rephrase a sentence. That understates what changes. Replacing “clinic” with “hospital” modifies the category boundary. Adding “near” introduces a geographic filter. Asking for a recommendation invites evaluation, while asking for an address asks for identification. A comparison prompt may encourage the system to assemble alternatives that would never appear in a direct lookup.
The laboratory treats prompt wording as part of the observation conditions because wording helps define which entities and sources can plausibly enter the answer.
This does not mean every changed result was caused by one changed word. Generative systems vary between runs, and their internal retrieval steps are not fully visible. A matched prompt comparison offers a narrower claim: when the team preserves the remaining conditions and varies a defined element, it can observe whether entity identification, visible citations, categories, locations, or apparent retrieval paths change with it.
The procedure works best when the contrast is small enough to inspect. If one prompt asks for a Bangkok clinic and another asks for luxury medical tourism across Thailand, the resulting difference is unsurprising and difficult to attribute. A more useful pair might preserve the treatment, district, language, and request type while changing “clinic” to “hospital.” Even then, the laboratory describes association rather than hidden causation.
The exact prompt is preserved. So are punctuation, transliteration, language, named location, requested task, model context, observation date, and visible citations. A paraphrase in the research notes would be easier to read, but it would also erase the thing under examination.
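One way to picture such a record is sketched below. This is an illustration only, not the laboratory's actual tooling: the class name `Observation`, its field names, the helper functions, and every sample value are assumptions introduced here. The sketch stores the prompt verbatim and reports which recorded conditions differ between two observations, so a pair can be checked for a single varied element.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Observation:
    """Hypothetical record of one prompt observation (all field names are assumptions)."""
    prompt: str               # exact text: punctuation and transliteration untouched
    language: str
    named_location: str
    requested_task: str       # e.g. "lookup", "recommendation", "comparison"
    category_term: str        # e.g. "clinic", "hospital"
    model_context: str
    observation_date: str     # ISO date string
    visible_citations: Tuple[str, ...] = ()


# Conditions expected to stay fixed within a matched pair. The prompt text and
# the citations are excluded because they are expected to differ.
CONDITION_FIELDS = (
    "language", "named_location", "requested_task",
    "category_term", "model_context", "observation_date",
)


def varied_conditions(a: Observation, b: Observation) -> List[str]:
    """Return the names of recorded conditions that differ between a and b."""
    return [name for name in CONDITION_FIELDS if getattr(a, name) != getattr(b, name)]


def is_matched_pair(a: Observation, b: Observation) -> bool:
    """A pair counts as matched here when exactly one recorded condition varies."""
    return len(varied_conditions(a, b)) == 1


# Placeholder values for illustration only.
clinic = Observation(
    prompt="Find a clinic in the district offering this treatment",
    language="en",
    named_location="example district",
    requested_task="lookup",
    category_term="clinic",
    model_context="example model, fresh session",
    observation_date="2000-01-01",
)
hospital = replace(
    clinic,
    prompt="Find a hospital in the district offering this treatment",
    category_term="hospital",
)

print(varied_conditions(clinic, hospital))  # ['category_term']
```

A pair that changes both the location and the request type would return two field names, which marks it as a contrast too wide to attribute.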
Service language can pull an entity into a broader category
In the composite clinic scenario, narrow service wording returns the independent provider together with its website, map profile, and treatment page. When the prompt substitutes a broader medical category, the visible source set changes. Directories with platform-generated labels become more prominent, or the similarly named larger facility appears among the supporting pages.
A common temptation is to say that the model “misunderstood” the business. The data are usually thinner than that. The answer may have selected the wrong entity, selected the right entity and stretched its category, or blended attributes from both. Those are different failures.
Prompt wording helps separate them. If the clinic remains correctly identified across category variants but the label expands from clinic to hospital, the problem may sit in attribution or final wording. If the address, ownership, and service claims also move toward the larger facility, entity identification has probably shifted. “Probably” matters here; the apparent retrieval path remains an inference from the preserved record.
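The distinction can be written down as a rough decision rule. The sketch below is a hedged illustration, not a description of the laboratory's procedure: the attribute keys, the function name `describe_drift`, and the returned labels are all hypothetical, and the output is a prompt for closer reading, not a verdict about what the system did internally.

```python
# Attributes treated here as identity signals (an assumption for this sketch).
IDENTITY_KEYS = ("address", "ownership", "services")


def describe_drift(baseline: dict, variant: dict) -> str:
    """Compare the attributes stated in two answers and name the likely pattern."""
    moved = [k for k in IDENTITY_KEYS if baseline.get(k) != variant.get(k)]
    label_changed = baseline.get("category") != variant.get("category")

    if not moved and not label_changed:
        return "stable"
    if not moved:
        # Same address, ownership, and services; only the label expanded.
        return "category stretched, entity stable"
    if len(moved) == len(IDENTITY_KEYS):
        # Every identity attribute moved together.
        return "entity probably shifted"
    # Some identity attributes moved and others stayed.
    return "possible blend of attributes"


# Placeholder attribute values for illustration only.
narrow = {"category": "clinic", "address": "address A",
          "ownership": "independent", "services": "treatment list A"}
broad = dict(narrow, category="hospital")

print(describe_drift(narrow, broad))  # category stretched, entity stable
```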
Service terms can behave like magnets. A word such as “hospital,” “school,” “resort,” or “agency” may coincide with answers that favour entities already classified that way in directories and map systems. A narrower term may accompany the business’s own pages instead. The prompt does not prove why a particular source appeared, but it changes the request conditions under which the observed source set was produced.
This is especially consequential in Thailand, where local businesses can be described through Thai categories, several English equivalents, and platform labels that are only roughly aligned. A Thai phrase may preserve a distinction that an English category collapses. The opposite can occur when an English trade term is more precise than the category used on local listings.
The laboratory does not presume which language is superior. It records what each formulation retrieves.
Location wording can split or merge branches
Study object B supplies a second composite setting: a restaurant group with branches in Bangkok and a neighbouring province. Maps, directories, social pages, and booking platforms use inconsistent branch labels. A similarly named venue sits in the same discovery category.
A direct prompt using the full Bangkok branch name may retrieve the correct map record. Remove the branch label and ask for “the restaurant near Bangkok,” and the neighbouring province can enter. Replace the place name with a district, landmark, or “near the airport,” and the answer may reorganise the candidate set again.
Local geography is rarely a neat stack of boxes. Residents may use a city name loosely for surrounding areas. Booking platforms may display the province, while social pages prefer a familiar district or tourist label. A branch can therefore appear geographically plausible in several descriptions, though only one matches its formal address.
In one typical pattern, the answer identifies the group correctly but cites a branch page from the neighbouring province. In another, it selects the Bangkok listing and borrows the opening status of the similarly named venue. A third prompt using a landmark returns the correct branch, then says it belongs to the wrong district. The small mistake about the district is useful; it shows that retrieval and geographic wording can diverge even after the entity appears settled.
The laboratory compares these prompts as distinct discovery situations. “Where is this branch?” tests identification and location. “Which branch is closest?” adds comparison. “Recommend a restaurant near this place” invites the system to judge fit as well as proximity. The wording changes the task, and the task changes what evidence the answer needs.
A source that directly supports an address may say nothing about closeness. A branch page may confirm identity without supporting the claim that the location is convenient for a named landmark. Prompt wording can therefore alter the claim-source burden even when the same page appears.
Recommendation prompts introduce extra claims
Recommendation language is especially prone to drift because it asks the system to do more than identify an entity. Words such as “best,” “popular,” “good for families,” or “worth visiting” invite evaluative statements. The visible sources may support the business name, category, and location while offering little support for the judgment.
The Four Source Relationships typology makes this expansion visible. Direct support exists when a source supports the claim as stated. Stretched support appears when a page establishes a narrower fact but the answer turns it into a broader recommendation. Borrowed identity occurs when praise, reputation, location, or service details belonging to another entity cross into the selected business. Unsupported arrival describes a claim for which no visible source in the observation provides support.
Consider the restaurant group. A booking page may show that a branch accepts reservations and serves a particular cuisine. Calling it “one of the most popular family restaurants near Bangkok” asks the source to carry several additional claims. The family suitability may be inferred from photographs. Popularity may come from no visible source at all. “Near Bangkok” may be geographically loose. The citation looks relevant because it identifies the venue, yet its support is stretched across the rest of the sentence.
A comparison prompt can create a different problem. Asked to compare two venues, the system may import a category or reputation phrase from one and place it beside the other. The error is easier to miss because both businesses belong in the answer. Their proximity on the page becomes a channel for borrowed identity.
The laboratory therefore does not evaluate citations globally. Each claim is separated and tested against the visible material. Changing the prompt from lookup to recommendation often increases the number and variety of claims, even when the named entity remains the same.
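A claim-by-claim record might look like the sketch below. It encodes the four relationships named in the typology and tallies judgments that a human reader has already made; it does not automate the judgment. The class names, the `tally` helper, and the sample judgments for the restaurant sentence are assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class SourceRelationship(Enum):
    DIRECT_SUPPORT = "direct support"
    STRETCHED_SUPPORT = "stretched support"
    BORROWED_IDENTITY = "borrowed identity"
    UNSUPPORTED_ARRIVAL = "unsupported arrival"


@dataclass(frozen=True)
class ClaimJudgment:
    """One claim from an answer, with the relationship a reviewer assigned to it."""
    claim: str
    relationship: SourceRelationship
    source: Optional[str] = None  # None when no visible source supports the claim


def tally(judgments: Iterable[ClaimJudgment]) -> Counter:
    """Count how many claims fall under each relationship."""
    return Counter(j.relationship for j in judgments)


# Hypothetical reviewer judgments for one recommendation sentence.
recommendation = [
    ClaimJudgment("accepts reservations", SourceRelationship.DIRECT_SUPPORT, "booking page"),
    ClaimJudgment("serves the stated cuisine", SourceRelationship.DIRECT_SUPPORT, "booking page"),
    ClaimJudgment("suitable for families", SourceRelationship.STRETCHED_SUPPORT, "booking page"),
    ClaimJudgment("near Bangkok", SourceRelationship.STRETCHED_SUPPORT, "booking page"),
    ClaimJudgment("one of the most popular", SourceRelationship.UNSUPPORTED_ARRIVAL),
]

for relationship, count in tally(recommendation).items():
    print(relationship.value, count)
```

Keeping one row per claim is the point of the design: a single citation can sit under direct support for one claim and stretched support for another in the same sentence.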
This has a practical consequence for visibility checks. A business may look accurately represented under a branded lookup and become unstable under a service recommendation. The second prompt is not merely a noisier version of the first. It asks the system to construct a different kind of answer from a wider and more demanding evidence set.
Following the apparent path without inventing it
A retrieval path is a reconstruction of the pages, listings, query interpretations, and source relationships that may have guided an answer. It is based on visible evidence because the system’s complete internal process is unavailable.
Prompt comparison can strengthen that reconstruction. If a category word changes and a new directory source appears beside a broader business label, the two observations fit a plausible interpretation. If a branch term is removed and the answer begins citing a group-level page, that shift may help explain why location details become mixed.
Still, the laboratory avoids turning sequence into proof. The changed prompt and changed source may coincide without one directly causing the other. An undisclosed source could have influenced both answers. Ordinary run variation may also account for part of the difference.
For that reason, matched comparisons are repeated rather than read once. The team looks for relationships that recur across repeated runs, models, languages, and observation occasions where appropriate. A pattern that reappears becomes more informative, though it remains bounded by the preserved conditions.
Cross-language comparisons require special care. A Thai and English prompt can express the same broad intent while differing in category precision, place hierarchy, politeness, or implied audience. Literal translation may produce an unnatural query that no ordinary user would ask. The laboratory records the formulations as matched discovery intents, not as perfectly equivalent strings.
Sometimes the most useful result is that no stable path appears. One formulation may alternate between two entities, while another reliably returns one branch but varies its citations. That asymmetry suggests different kinds of instability. It does not need to be cleaned into a single story.
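The two kinds of instability described above can be kept apart with a small summary over repeated runs. The sketch below is a minimal illustration under stated assumptions: each run is represented as a dictionary with hypothetical `entity` and `citations` keys, and the function name `summarise_runs` is invented here. It reports raw counts only and sets no threshold for what counts as stable enough.

```python
from collections import Counter
from typing import Dict, List


def summarise_runs(runs: List[dict]) -> Dict[str, object]:
    """Summarise whether the entity and the citation set stay the same across runs."""
    entities = Counter(run["entity"] for run in runs)
    # frozenset ignores citation order, so only the membership of the set matters.
    citation_sets = Counter(frozenset(run["citations"]) for run in runs)
    return {
        "entity_stable": len(entities) == 1,
        "citations_stable": len(citation_sets) == 1,
        "entity_counts": dict(entities),
    }


# Placeholder runs for two formulations (illustrative values only).
formulation_a = [
    {"entity": "entity 1", "citations": ["map listing"]},
    {"entity": "entity 2", "citations": ["directory"]},
    {"entity": "entity 1", "citations": ["map listing"]},
]
formulation_b = [
    {"entity": "branch X", "citations": ["branch page"]},
    {"entity": "branch X", "citations": ["group page", "booking page"]},
    {"entity": "branch X", "citations": ["branch page"]},
]

print(summarise_runs(formulation_a)["entity_stable"])     # False
print(summarise_runs(formulation_b)["citations_stable"])  # False
```

Here the first formulation alternates between entities, while the second holds the entity and varies its citations, so the two summaries stay separate and are not merged into one score.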
Limits of prompt comparison
The method cannot reveal private retrieval infrastructure, hidden ranking logic, undisclosed intermediate steps, or every source used internally by the model. Visible citations may represent only part of the material that shaped the output. Apparent retrieval paths therefore remain inferred.
A small set of prompt pairs cannot establish a universal rule about words such as “best,” “near,” or “hospital.” Their effects depend on the entity, language, geography, public source environment, and model context. The laboratory reports what changed under recorded conditions and avoids converting those observations into general percentages or thresholds.
Public information can also change between runs. A map category may be corrected, a branch page rewritten, or an old directory entry removed. When prompt comparisons occur on different observation dates, those changes can complicate interpretation. The record preserves dates and visible source states where possible.
The prompt itself can contain an error. Asking for a business in the wrong province may cause a system to accommodate the premise rather than challenge it. That behaviour is worth studying, but it differs from an unprompted location mistake. The laboratory marks the false premise instead of folding it into the same category.
Most importantly, wording effects should not be mistaken for complete control. A carefully phrased prompt can reduce ambiguity in one observation and fail in the next. The research value lies in identifying which formulations expose or suppress a recurring entity and source problem.
A business does not possess one fixed generated identity waiting to be retrieved. Its apparent identity is assembled under conditions, and the wording of the question is one of those conditions. Change the question slightly, and a different set of public fragments may become available for the answer to join.