A recommendation can sound well sourced because its factual scaffolding is cited, while the decisive judgment—best, trusted, ideal, highly regarded—enters by another route entirely.
A generated answer recommended a restaurant for “consistently excellent local dining” and displayed three citations. One source confirmed the address. Another listed the menu and opening hours. The third belonged to a similarly named venue in another province and contained several favourable customer comments. None established consistency, local standing, or why this restaurant should outrank nearby alternatives.
This was a composite scenario assembled from recurring recommendation patterns studied by the laboratory. The answer also misspelled one branch name and gave a closing time that was correct only for weekdays. Those smaller defects were useful. They discouraged the tempting interpretation that the system had reached a stable, deeply researched judgment and merely failed to show its evidence.
Recommendation sentences contain several kinds of claim
A recommendation rarely consists of one claim. “This is a highly regarded riverside restaurant, ideal for families and known for attentive service” may look like a single descriptive unit, but it contains an entity claim, a location claim, an audience judgment, a reputation claim, and a service evaluation.
The visible sources may support these parts unevenly. A map page can establish location. The restaurant’s own website may describe family seating. A booking platform may label the venue as riverside. None necessarily supports “highly regarded” or “attentive service,” especially when the answer offers those judgments as settled facts rather than attributed opinions.
A recommendation claim is an evaluative statement because it tells the reader how a business should be valued, selected, or compared. The evidence required depends on the exact judgment. A source that proves a restaurant exists does not, for that reason, prove that it is good.
The laboratory therefore breaks recommendation answers into claim-sized units before assessing citations. This can feel pedantic until the central recommendation disappears under examination. In the composite restaurant case, the sources established that the venue operated, served regional dishes, accepted reservations, and had a branch in Bangkok. The decisive phrases—“one of the safest choices,” “consistently excellent,” and “preferred by local families”—remained weakly supported or unsupported.
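The decomposition can be sketched in a few lines of Python. This is an illustration only, not the laboratory's tooling: the `ClaimUnit` type, the `lacking_visible_source` helper, and the source labels are hypothetical, and the assignment of facts to particular sources is a placeholder. For simplicity the three decisive phrases are recorded with no visible source, although some were weakly supported rather than wholly unsupported.

```python
from dataclasses import dataclass, field


@dataclass
class ClaimUnit:
    """One claim-sized unit taken from a generated recommendation."""
    text: str
    evaluative: bool  # True when the unit judges or ranks the venue
    visible_sources: list = field(default_factory=list)


# Units paraphrased from the composite restaurant case.
# Source labels are placeholders, not real pages.
units = [
    ClaimUnit("the venue operates", False, ["source_a"]),
    ClaimUnit("serves regional dishes", False, ["source_b"]),
    ClaimUnit("accepts reservations", False, ["source_b"]),
    ClaimUnit("has a branch in Bangkok", False, ["source_a"]),
    ClaimUnit("one of the safest choices", True),
    ClaimUnit("consistently excellent", True),
    ClaimUnit("preferred by local families", True),
]


def lacking_visible_source(claims):
    """Return the text of every unit with no visible source attached."""
    return [c.text for c in claims if not c.visible_sources]


unsupported = lacking_visible_source(units)
print(unsupported)
```

Laid out this way, the cited units are all operational facts, and every unit left without a source is evaluative.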
Recommendation prose often borrows authority from nearby factual detail. Correct opening hours and a recognisable address make the evaluation sound researched. The answer’s factual base acts like a sturdy table on which an unsupported trophy has been placed.
The source may prove that the restaurant exists while the answer quietly awards it first place.
Applying the Four Source Relationships to praise
The laboratory uses the Four Source Relationships typology for individual claims: direct support, stretched support, borrowed identity, and unsupported arrival. These categories are qualitative descriptions of observed source relationships. They are not ratings of recommendation quality.
Direct support occurs when the source supports the evaluative claim as stated. This could happen when an answer accurately attributes a specific award, ranking, or published assessment to the page where it appears. Even then, the wording matters. A source calling a venue “editor’s choice for riverside dining” would not directly support the broader claim that it is the city’s best restaurant.
Stretched support is common in generated recommendations. A source may contain positive customer comments, while the answer converts them into a stable reputation claim. A venue described by its own website as “family-friendly” becomes “a favourite among local families.” A listing with a high platform rating becomes “widely trusted,” though the answer does not state the platform, review period, or basis of comparison.
Borrowed identity appears when praise concerning another entity enters the recommendation. In the composite scenario, favourable comments on a page for a similarly named provincial restaurant appeared compatible with the generated description of the Bangkok branch. The atmosphere, service, and riverside view described on that page belonged to the other venue.
Unsupported arrival is the classification used when no visible source in the preserved observation supports the claim. “A safe choice for discerning travellers” belonged here. So did the claim that the restaurant was “known for consistency across all branches.” Those statements might have reflected undisclosed material, a model inference, or generic recommendation language. The observation could show only that visible support was absent.
The distinction between stretched support and unsupported arrival can be uncomfortable. If a source contains one positive review, does that partially support “consistently excellent”? The laboratory generally treats the relationship as stretched when a recognisable, narrower basis exists. Consistency is still unsupported as a temporal claim, but the positive evaluation did not arrive from nowhere. The classification records the overextension.
This is not a numerical exercise. The team does not count citations and calculate a confidence score. A single direct source can support a precise statement better than several vaguely relevant pages. The question remains local: what does this source establish about this claim?
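The typology can be written down as a small decision function, which makes the boundaries between categories explicit. The sketch below is an assumption about how the four relationships might be ordered, not a published procedure. The function name and parameters are hypothetical, and each argument is a reviewer's judgment about one claim and its visible sources, not something the code computes. It returns a label, never a score.

```python
from enum import Enum


class SourceRelationship(Enum):
    DIRECT_SUPPORT = "direct support"
    STRETCHED_SUPPORT = "stretched support"
    BORROWED_IDENTITY = "borrowed identity"
    UNSUPPORTED_ARRIVAL = "unsupported arrival"


def classify(has_visible_source, same_entity, states_claim_as_worded,
             has_narrower_basis):
    """Map a reviewer's judgments about one claim to a relationship label.

    The order of the checks is an assumption made for this sketch.
    """
    if not has_visible_source:
        return SourceRelationship.UNSUPPORTED_ARRIVAL
    if not same_entity:
        # The source concerns another venue or branch.
        return SourceRelationship.BORROWED_IDENTITY
    if states_claim_as_worded:
        return SourceRelationship.DIRECT_SUPPORT
    if has_narrower_basis:
        # A recognisable but narrower basis exists; the claim overextends it.
        return SourceRelationship.STRETCHED_SUPPORT
    # A source is shown, but it says nothing relevant to this claim.
    return SourceRelationship.UNSUPPORTED_ARRIVAL


# "Consistently excellent", resting on one positive review of the right venue.
label = classify(
    has_visible_source=True,
    same_entity=True,
    states_claim_as_worded=False,
    has_narrower_basis=True,
)
print(label.value)
```

The final branch covers the case where a citation is displayed beside a claim but bears on something else, such as an address page next to a superlative.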
How generic recommendation language enters the answer
Some recommendation phrases appear tailored while carrying little entity-specific information. “A solid option,” “well worth considering,” “known for quality,” and “ideal for visitors seeking an authentic experience” can fit thousands of businesses. Their fluency disguises their interchangeability.
In repeated runs preserved by the laboratory, such phrases attached themselves to different entities even when the visible sources contained mainly operational facts. The wording changed, yet the function remained stable: the answer closed the gap between identifying a business and endorsing it.
That gap matters because users rarely ask only for a list of existing venues. They ask where to go, which provider to choose, or what is best for a particular need. A system that returned only addresses and categories would feel incomplete. The generated response therefore has pressure to evaluate, even where its visible evidence is better suited to identification.
The composite restaurant scenario showed three visible forms of expansion. First, the answer widened self-description: “welcoming space” became “known for warm hospitality.” Second, it generalised from platform material: a set of favourable comments became “consistently praised.” Third, it used recommendation language with no visible local basis: “an excellent choice for first-time visitors.”
The laboratory cannot confirm that these forms correspond to internal stages. They are descriptions of the relationship between output and record. The same phrase might have been influenced by an undisclosed page, a broader learned association, or wording conventions within generated recommendations.
Still, genericity provides a useful diagnostic. When an evaluative phrase remains equally plausible after the business name is replaced, it deserves closer inspection. This does not prove the claim false. It indicates that the wording may be doing more persuasive work than the visible evidence can carry.
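The substitution test can be prepared mechanically, even though the judgment of plausibility stays with the reader. The sketch below is a rough aid under stated assumptions: `swap_entity` and `mentions_entity_detail` are hypothetical helpers, the venue names are invented, and the second function is a crude single-word match against details recorded for the entity, not a measure of truth.

```python
import re


def swap_entity(sentence, entity, replacements):
    """Rewrite the sentence once per replacement name, for side-by-side reading."""
    return [sentence.replace(entity, other) for other in replacements]


def mentions_entity_detail(phrase, entity_details):
    """Crude check: does the phrase use any single-word detail recorded for the entity?"""
    words = set(re.findall(r"[a-z]+", phrase.lower()))
    return any(detail.lower() in words for detail in entity_details)


# Hypothetical single-word details drawn from the visible sources for one venue.
details = {"riverside", "regional", "Bangkok"}

variants = swap_entity(
    "Venue A is well worth considering.", "Venue A", ["Venue B", "Clinic C"]
)
generic_flag = not mentions_entity_detail("known for quality", details)
specific_flag = mentions_entity_detail(
    "editor's choice for riverside dining", details
)
print(variants, generic_flag, specific_flag)
```

A phrase that reads equally well in every variant and mentions no recorded detail is a candidate for closer inspection, nothing more.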
A further complication is attribution. “Reviewers praise the service” can be checked against reviews. “The service is attentive” removes the attribution and presents the evaluation as the answer’s own settled description. That small grammatical shift changes the claim-source relationship. The evidence has not improved; its uncertainty has been edited out.
Comparison answers create hidden ranking claims
A recommendation becomes more demanding when several businesses are compared. Selecting one venue as “best for families” implies a relationship across alternatives, even when the answer never states the comparison procedure.
In a matched set of prompts using the composite restaurant scenario, the laboratory asked in turn for a venue suitable for a family meal, for a quiet business dinner, and for a short visit near a transport connection. The same venue appeared under all three formulations. Its supporting sources established a broad menu, reservations, and a convenient address. They did not establish quietness, suitability for children, or superiority to nearby options.
The repeated selection might suggest a stable advantage. It could also reflect stronger retrievability. The venue had more English-language pages, more consistent branch naming, and more platform coverage than several alternatives represented in the composite record. Those conditions made the business easier to assemble into an answer, but ease of retrieval was not the same as suitability.
A comparison claim requires evidence about alternatives because “best” describes a relation, not an isolated property. Without a visible comparison basis, the answer may be reporting prominence in the retrieval environment while sounding like an assessment of business quality.
This is especially relevant for Thai organisations whose public information is unevenly distributed across languages. A venue with concise English descriptions may appear more recommendable in an English prompt because the system can describe it with fewer unresolved steps. Another business may fit the user’s need better while offering thinner, fragmented, or Thai-only source material.
The laboratory treats that explanation cautiously. The record can show which sources were visible and which entities returned. It cannot establish that richer English coverage caused the recommendation. The pattern is compatible with that interpretation, and renewed runs can test whether it returns under Thai prompts, quoted names, narrower locations, or altered comparison wording.
When the same entity remains preferred despite changes in user need, the team examines whether the recommendation is genuinely condition-sensitive. Sometimes the answer swaps only the final adjective. The business stays fixed; “family-friendly” becomes “convenient for professionals,” then “accessible for visitors.” That is a warning that the recommendation may have been selected before the suitability language was composed, but it does not establish an internal sequence.
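A simple check over preserved runs can surface this pattern for closer reading. The sketch below assumes each run is stored as a tuple of user need, recommended entity, and suitability phrase. That layout, the function name, and the venue names are hypothetical. A flagged entity shows only that the selection did not vary with the stated need; it says nothing about internal sequence.

```python
from collections import defaultdict


def fixed_across_needs(runs):
    """Return entities recommended for every distinct user need in the runs.

    runs: iterable of (user_need, recommended_entity, suitability_phrase).
    """
    needs_by_entity = defaultdict(set)
    all_needs = set()
    for need, entity, _phrase in runs:
        needs_by_entity[entity].add(need)
        all_needs.add(need)
    if len(all_needs) < 2:
        return []  # one need cannot show whether the choice varies
    return sorted(e for e, needs in needs_by_entity.items() if needs == all_needs)


runs = [
    ("family meal", "Venue A", "family-friendly"),
    ("quiet business dinner", "Venue A", "convenient for professionals"),
    ("short visit near a transport connection", "Venue A", "accessible for visitors"),
    ("quiet business dinner", "Venue B", "a calm setting"),
]

flagged = fixed_across_needs(runs)
print(flagged)
```

The suitability phrases are kept in each record so that a reviewer can read them next to the flagged entity and see whether only the adjective changed.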
What readers can verify in a recommendation record
A useful review begins by preserving the answer as an observation: prompt, generated wording, visible citations, language, model context, observation date, and run conditions. Screenshots can matter because citation placement and displayed snippets may change.
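One way to keep such an observation intact is a small immutable record. The sketch below is an assumption about layout, not a description of the laboratory's storage. The `Observation` type and its field names are hypothetical, chosen to mirror the items listed above, and every value in the example, including the date and file name, is a placeholder.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One preserved answer, stored as observed rather than as later corrected."""
    prompt: str
    generated_wording: str
    visible_citations: tuple       # citations in the order they were displayed
    language: str
    model_context: str
    observation_date: date
    run_conditions: dict = field(default_factory=dict)
    screenshot_paths: tuple = ()   # optional captures of citation placement


# Placeholder values throughout; none of these describe a real run.
obs = Observation(
    prompt="Where should a family eat near the river?",
    generated_wording="Venue A offers consistently excellent local dining.",
    visible_citations=("source_a", "source_b", "source_c"),
    language="en",
    model_context="placeholder model and interface",
    observation_date=date(2000, 1, 1),
    screenshot_paths=("capture_001.png",),
)
print(obs.language, len(obs.visible_citations))
```

Freezing the record prevents its fields from being reassigned after capture, which suits material that is meant to be compared against later runs.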
The recommendation should then be divided into identity, operational, descriptive, evaluative, and comparative claims. These labels are practical rather than canonical categories. Their purpose is to stop one well-supported fact from sheltering several unsupported judgments.
For each claim, the reader can ask which entity the source concerns, whether the source states the claim, and whether the answer has widened its scope. An address on a map page may be direct support. A platform category may offer stretched support for a broader business label. Praise from another branch may become borrowed identity. A superlative without visible basis becomes unsupported arrival.
This process does not require hostility toward recommendation systems. It requires resisting a common visual shortcut: citation present, therefore sentence supported. Citations are evaluated claim by claim because nearby statements can differ in entity, scope, and evidential status.
The practical consequence is sharper than “check the sources.” A business can appear accurately identified and still be recommended for invented reasons. It can also be recommended for real attributes borrowed from another branch or another organisation. Visibility, identity accuracy, and recommendation support need separate examination.
Limits of the visible record
The method cannot reveal every source used internally by a model. Classifying a claim as unsupported arrival means only that no visible source in the preserved observation supports it. It does not prove that no supporting material exists anywhere.
Nor can the laboratory infer recommendation quality from source support alone. A directly supported claim may rely on a weak or self-interested source. An unsupported claim may happen to be true. The classification describes the visible claim-source relationship, not the ultimate truth or usefulness of the recommendation.
Repeated runs add pattern information but do not remove this boundary. If several models call the same restaurant “highly regarded,” that agreement is an observation. The systems may rely on similar public materials, shared category assumptions, or generic wording habits. Agreement alone does not confirm the reputation claim.
The composite scenario also cannot establish why the restaurant was selected over its competitors. The visible record permits several explanations: source abundance, naming consistency, location fit, platform prominence, or an undisclosed comparison signal. The laboratory can vary prompts and inspect what changes, but private ranking and retrieval logic remain unavailable.
The defensible conclusion is narrower. The answer’s factual citations did not support all of its evaluative and comparative language. Some praise stretched narrower evidence, some belonged to another entity, and some arrived without visible support. The recommendation looked complete because those relationships were compressed into one fluent paragraph.
That compression is the object worth studying. It is where a list of partially supported facts becomes advice.