How to Structure Scraped Hotel & Flight Data (With Schema Examples)

By TravelScrape Engineering · Updated June 2026 · Free · No paywall

Collecting travel data is only half the job. Structuring it well is what turns a pile of prices into something you can actually analyse. Here's how TravelScrape's engineers model it — with concrete schema examples you can copy.

Short answer

Store each observation as a record with a stable entity ID, the price, currency, stay or travel dates, source and a capture timestamp. Separate the slow-changing entity (the hotel or route) from the fast-changing observation (its price on a date). This keeps travel data clean, queryable and analytics-ready. TravelScrape delivers data already in this shape.

Why structure matters

Prices change constantly, so travel data is really a stream of observations over time, not a single snapshot. The most common beginner mistake is overwriting a hotel's price each time you collect it — which destroys the history that makes the data valuable. The fix is to model two things separately: the entity (a hotel, room type or flight route) and the observation (its price and availability at a specific moment).

Step 1 — Model the hotel entity

The entity rarely changes. Store it once and reference it by a stable ID:

{
  "hotel_id": "bk_123456",
  "name": "Sea View Hotel",
  "source": "booking.com",
  "city": "Goa",
  "country": "IN",
  "star_rating": 4,
  "latitude": 15.55,
  "longitude": 73.75
}
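
As a minimal sketch of "store it once", the entity above could be held in a keyed store so that repeated captures never create a second copy. The Hotel class, the in-memory hotels dictionary and the upsert_hotel helper are illustrative names assumed for this example, not part of any TravelScrape API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    """Slow-changing entity; field names follow the JSON example above."""
    hotel_id: str
    name: str
    source: str
    city: str
    country: str
    star_rating: int
    latitude: float
    longitude: float


# Hypothetical in-memory store keyed by the stable ID (a real system
# would use a database table with hotel_id as the primary key).
hotels = {}


def upsert_hotel(hotel):
    """Insert or replace the entity under its stable ID."""
    hotels[hotel.hotel_id] = hotel


sea_view = Hotel(
    hotel_id="bk_123456",
    name="Sea View Hotel",
    source="booking.com",
    city="Goa",
    country="IN",
    star_rating=4,
    latitude=15.55,
    longitude=73.75,
)

upsert_hotel(sea_view)
upsert_hotel(sea_view)  # a second capture of the same entity adds nothing
print(len(hotels))
```

Keying on hotel_id is what lets every later price observation point back to exactly one entity record.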

Step 2 — Model the price observation

The observation changes constantly. Store a new record every time you collect a price, and never update an existing one:

{
  "hotel_id": "bk_123456",
  "room_type": "Deluxe Double",
  "checkin": "2026-07-10",
  "checkout": "2026-07-11",
  "price": 6800,
  "currency": "INR",
  "available": true,
  "captured_at": "2026-06-01T09:30:00Z",
  "source": "booking.com"
}

The critical field is captured_at. Because you keep every observation, you can later chart how a price moved across days or weeks — the foundation of trend analysis and forecasting.
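
The append-only idea can be sketched with Python's built-in sqlite3 module. The table name price_observation, the record_observation helper and the second capture (price 7200 on 2 June) are assumptions made up for illustration; only the first record comes from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE price_observation (
        hotel_id    TEXT    NOT NULL,
        room_type   TEXT    NOT NULL,
        checkin     TEXT    NOT NULL,
        checkout    TEXT    NOT NULL,
        price       REAL    NOT NULL,
        currency    TEXT    NOT NULL,
        available   INTEGER NOT NULL,
        captured_at TEXT    NOT NULL,
        source      TEXT    NOT NULL
    )
    """
)


def record_observation(obs):
    """Append one observation; existing rows are never updated."""
    row = {**obs, "available": int(obs["available"])}
    conn.execute(
        "INSERT INTO price_observation VALUES "
        "(:hotel_id, :room_type, :checkin, :checkout, :price, "
        ":currency, :available, :captured_at, :source)",
        row,
    )


first = {
    "hotel_id": "bk_123456",
    "room_type": "Deluxe Double",
    "checkin": "2026-07-10",
    "checkout": "2026-07-11",
    "price": 6800,
    "currency": "INR",
    "available": True,
    "captured_at": "2026-06-01T09:30:00Z",
    "source": "booking.com",
}
record_observation(first)
# Hypothetical second capture a day later, for illustration only.
record_observation({**first, "price": 7200, "captured_at": "2026-06-02T09:30:00Z"})

# Price history for one stay date, oldest capture first. UTC timestamps
# in the same ISO 8601 format sort correctly as plain text.
history = conn.execute(
    "SELECT captured_at, price FROM price_observation "
    "WHERE hotel_id = ? AND checkin = ? ORDER BY captured_at",
    ("bk_123456", "2026-07-10"),
).fetchall()

for captured_at, price in history:
    print(captured_at, price)
```

Because both rows survive, the query returns the full price path for the 10 July stay rather than only the latest value.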

Step 3 — Do the same for flights

{
  "route": "BOM-DEL",
  "carrier": "AI",
  "depart_date": "2026-08-15",
  "price": 5400,
  "currency": "INR",
  "cabin": "economy",
  "captured_at": "2026-06-01T09:30:00Z",
  "source": "metasearch"
}
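
One way to load a flight record like this is to parse it into typed fields before storing it, so malformed dates or prices fail early. This is a sketch under assumptions: the FlightObservation class, the parse_flight helper, the split of route into origin and destination, and the non-negative price check are illustrative choices, not a documented TravelScrape format.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class FlightObservation:
    origin: str
    destination: str
    carrier: str
    depart_date: date
    price: float
    currency: str
    cabin: str
    captured_at: datetime
    source: str


def parse_flight(record):
    """Turn one raw JSON-style record into a typed observation."""
    origin, destination = record["route"].split("-")
    if record["price"] < 0:
        raise ValueError("price must be non-negative")
    # Replace the trailing "Z" so fromisoformat also works on Python < 3.11.
    captured = datetime.fromisoformat(record["captured_at"].replace("Z", "+00:00"))
    return FlightObservation(
        origin=origin,
        destination=destination,
        carrier=record["carrier"],
        depart_date=date.fromisoformat(record["depart_date"]),
        price=float(record["price"]),
        currency=record["currency"],
        cabin=record["cabin"],
        captured_at=captured,
        source=record["source"],
    )


flight = parse_flight(
    {
        "route": "BOM-DEL",
        "carrier": "AI",
        "depart_date": "2026-08-15",
        "price": 5400,
        "currency": "INR",
        "cabin": "economy",
        "captured_at": "2026-06-01T09:30:00Z",
        "source": "metasearch",
    }
)
print(flight.origin, flight.destination, flight.depart_date, flight.price)
```

In this sketch the route, carrier and departure date together play the role that hotel_id plays for hotels: they identify what the price belongs to, while captured_at identifies when it was seen.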

Step 4 — Match the same entity across sources

The same hotel appears on Booking.com, Expedia and Agoda under different IDs. To compare them, assign your own stable internal ID and map each source's identifier to it. Now a single query can show how one property is priced across every channel — the basis of rate parity monitoring.
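
The mapping can be as simple as a lookup from (source, source ID) to your internal ID. In the sketch below, the internal ID htl_0001, the Expedia and Agoda identifiers, and the Expedia and Agoda prices are all hypothetical values chosen for illustration; only bk_123456 and its 6800 INR price appear earlier in this guide.

```python
# (source, source-specific ID) -> internal stable ID.
SOURCE_TO_INTERNAL = {
    ("booking.com", "bk_123456"): "htl_0001",
    ("expedia", "ex_98765"): "htl_0001",   # hypothetical ID
    ("agoda", "ag_55501"): "htl_0001",     # hypothetical ID
}


def resolve(source, source_id):
    """Return the internal ID for a source listing, or None if unmapped."""
    return SOURCE_TO_INTERNAL.get((source, source_id))


def prices_by_channel(internal_id, observations):
    """Collect the price each channel shows for one internal property."""
    result = {}
    for obs in observations:
        if resolve(obs["source"], obs["source_id"]) == internal_id:
            result[obs["source"]] = obs["price"]
    return result


observations = [
    {"source": "booking.com", "source_id": "bk_123456", "price": 6800},
    {"source": "expedia", "source_id": "ex_98765", "price": 7000},   # hypothetical
    {"source": "agoda", "source_id": "ag_55501", "price": 6650},     # hypothetical
    {"source": "agoda", "source_id": "ag_00000", "price": 4000},     # unmapped listing
]

parity = prices_by_channel("htl_0001", observations)
print(parity)
print(max(parity.values()) - min(parity.values()))
```

Unmapped listings resolve to None and are skipped, which keeps an unmatched property from silently polluting a parity comparison.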

Best-practice rules

Tip: Keep two layers — raw captures in one table, and a cleaned, de-duplicated layer on top. You can always re-process raw data if your logic changes; you can never recover what you threw away.
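
A rough sketch of the two layers: the raw list is kept exactly as captured, and the clean layer is derived from it on demand. The choice of de-duplication key (entity, room, stay dates, source and capture time) and the helper names are assumptions for this example; your own rule may differ, which is precisely why the raw layer is kept.

```python
def observation_key(obs):
    """Fields assumed to identify one unique capture."""
    return (
        obs["hotel_id"],
        obs["room_type"],
        obs["checkin"],
        obs["checkout"],
        obs["source"],
        obs["captured_at"],
    )


def build_clean_layer(raw):
    """Derive a de-duplicated view without modifying the raw captures."""
    seen = set()
    clean = []
    for obs in raw:
        key = observation_key(obs)
        if key not in seen:
            seen.add(key)
            clean.append(obs)
    return clean


capture = {
    "hotel_id": "bk_123456",
    "room_type": "Deluxe Double",
    "checkin": "2026-07-10",
    "checkout": "2026-07-11",
    "price": 6800,
    "currency": "INR",
    "available": True,
    "captured_at": "2026-06-01T09:30:00Z",
    "source": "booking.com",
}

raw_captures = [
    capture,
    dict(capture),  # the same capture delivered twice
    {**capture, "price": 7200, "captured_at": "2026-06-02T09:30:00Z"},  # hypothetical later capture
]

clean_layer = build_clean_layer(raw_captures)
print(len(raw_captures), len(clean_layer))
```

If the de-duplication rule changes later, only build_clean_layer needs to be rerun; nothing in raw_captures has been lost.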

Putting it together

With entities, time-stamped observations, stable IDs and a raw layer, your travel data becomes genuinely useful: you can chart price trends, compare channels, detect promotions and feed a pricing engine — all from the same clean foundation. TravelScrape provides data already structured this way, validated and de-duplicated, as JSON, CSV or API, so you can load it straight into analytics. For the terms used here, see our glossary; for what to watch out for, see common scraping errors.
