Case Study · 🇮🇸 Iceland · Travel Data Extraction · 12 min read

Iceland OTA Data Extraction —
7 Platforms, 270-Day Forward
Dive.is · Reykjavik Excursions · Arctic Adventures

How TravelScrape designed and delivered a fully automated weekly data pipeline for 7 leading Icelandic OTAs — extracting all tour details, schedules, availability and pricing across a rolling 270-day forward window — structured as JSON, delivered via FTP every Monday.

7 OTAs: Iceland platforms covered (all major Icelandic OTAs)
270 days: Forward coverage window (rolling, updated weekly)
Weekly: Extraction frequency (JSON via FTP, Monday)
100%: Mandatory fields captured (all attributes per tour)
48hr: Pipeline go-live time (scoping to first delivery)
Project Overview

Systematic weekly extraction of Iceland's leading tour & activity OTAs — 270 days forward, every week

Iceland's adventure tourism market is served by a cluster of specialised OTA platforms, each offering unique tours, activities and experiences ranging from glacier hikes and Northern Lights expeditions to whale watching and diving in Silfra. This case study covers how TravelScrape solved OTA data extraction for that market. A business intelligence client required consistent, structured, weekly data from 7 of these leading platforms to support pricing strategy, competitive benchmarking and demand forecasting.

The project requirement: extract all tour listings, schedules, availability calendars and pricing data across a rolling 270-day forward window from 7 publicly accessible Icelandic OTA websites, every week, structured as JSON and delivered to the client's FTP endpoint on a fixed Monday schedule.

🇮🇸
Why Iceland tour & activity data extraction is technically complex
Iceland's tour market is intensely seasonal. Northern Lights tours run October–March. Midnight Sun tours run May–August. Each platform uses different availability calendars, session-based pricing, dynamic capacity logic and tour-specific scheduling rules. Extracting 270 days of forward data means capturing off-peak scheduling, seasonal pricing tiers, blackout dates, capacity constraints and real-time availability status — across 7 platforms each with distinct site architecture and anti-scraping measures.
Project Scope

Full scope — all in-scope requirements defined

Scope Summary: In-Scope Requirements (🇮🇸 Iceland OTA Project)
Requirement | Details
Industry | OTA (Tour & Activity)
Client Geography | Iceland
Target Platforms (7) | Dive.is, Mountain Guides Iceland, Reykjavik Excursions, Troll, Nice Travel, Bus Travel, Arctic Adventures
Access Mode | Public access (no login required)
Location Coverage | All Iceland: Reykjavik, South Coast, Golden Circle, Snæfellsnes, Westfjords, North Iceland, East Iceland, Vatnajökull region
Forward Window | 270 days, rolling forward from each weekly extraction date
Scope of Data | Comprehensive extraction of all tour, activity and pricing information; all mandatory output attributes captured per tour per date per platform
Output Format | JSON (structured, nested, fully typed, field-level validated)
Output Delivery | FTP to a client-specified endpoint, fixed folder structure per weekly run
Frequency | Weekly, every Monday, full refresh across all 7 platforms
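The rolling forward window in the scope table can be illustrated with a few lines of date arithmetic. This is a minimal sketch, not TravelScrape's implementation: the function name `forward_window` is hypothetical, and it assumes the window starts on the extraction date itself and spans exactly 270 calendar days.

```python
from datetime import date, timedelta

FORWARD_WINDOW_DAYS = 270  # forward window stated in the scope table


def forward_window(run_date: date, days: int = FORWARD_WINDOW_DAYS) -> list[date]:
    """Return every calendar date covered by one weekly run.

    Assumption: the window includes the run date and the following
    days - 1 dates. The case study does not say whether day 0 counts.
    """
    return [run_date + timedelta(days=offset) for offset in range(days)]


# Two consecutive Monday runs: the window slides forward by 7 days.
this_week = forward_window(date(2025, 1, 13))
next_week = forward_window(date(2025, 1, 20))
print(this_week[0], this_week[-1], len(this_week))
print(next_week[0], next_week[-1], len(next_week))
```

Because each run is a full refresh, the two windows overlap on all but 7 dates; only the leading week is dropped and a new trailing week is added.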
Target Platforms

7 Iceland OTA platforms — each with unique structure & data challenges

Each of the 7 target platforms has a distinct site architecture, booking system and data schema. TravelScrape built a dedicated extraction module per platform and normalised all 7 into a single unified JSON output schema.

1. Dive.is (dive.is)
Tags: Diving · Snorkelling · Silfra
Session-based availability with instructor capacity limits. Silfra snorkelling and diving with strict slot-based scheduling and equipment add-on pricing — requires headless browser to capture dynamic slot availability.
2. Mountain Guides Iceland (mountainguides.is)
Tags: Glacier · Hiking · Ice Climbing
Glacier hiking, ice climbing and mountain expeditions. Dynamic group-size-based pricing. Seasonal tour availability changes weekly — Vatnajökull routes differ significantly by month.
3. Reykjavik Excursions (re.is)
Tags: Day Tours · Bus Tours · Transfers
Largest Icelandic OTA — 100+ tours. Complex pricing matrix with early-bird, group and seasonal discounts layered on each tour. Multi-departure time slots require per-slot extraction.
4. Troll (troll.is)
Tags: Northern Lights · Whale Watching · Puffins
Weather-dependent tours with dynamic cancellation windows. Northern Lights availability shows "weather guarantee" status per date — a non-standard availability field requiring custom extraction logic.
5. Nice Travel (nicetravel.is)
Tags: Golden Circle · South Coast · Small Group
Premium small-group tours. Calendar fills 4–6 weeks ahead during peak. Multiple add-on options per tour (lunch, photography, private guide) increase field complexity per record.
6. Bus Travel Iceland (bustravel.is)
Tags: Scheduled Bus · Excursions · Ring Road
Scheduled bus excursions with fixed timetables. Seat-based availability across 30+ routes. Pricing varies by pick-up location and booking window — complex multi-departure structure per extraction.
7. Arctic Adventures (adventures.is)
Tags: Adventure · Multi-Day · Super Jeep
High-value adventure tours with deposit-based booking. Availability calendar shows slots 270+ days ahead with seasonal pricing bands — one of the most data-rich platforms in the project scope.
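The per-platform modules feeding a single unified schema can be pictured as a set of small adapter functions. The sketch below is illustrative only: the raw field names (`productId`, `route_code`, `fare` and so on) are invented for the example, since the case study does not describe what each site actually returns, and only two of the seven platforms are shown.

```python
from typing import Any, Callable

Record = dict[str, Any]


def normalise_dive_is(raw: Record) -> Record:
    # Hypothetical raw keys; the real site payload is not documented here.
    return {
        "platform": "Dive.is",
        "tour": {"id": str(raw["productId"]), "name": raw["title"].strip()},
        "availability": {"date": raw["day"], "slots_remaining": int(raw["free"])},
        "pricing": {"currency": raw["cur"], "adult_price": int(raw["price"])},
    }


def normalise_bus_travel(raw: Record) -> Record:
    # Hypothetical raw keys, deliberately shaped differently from the one above.
    return {
        "platform": "Bus Travel Iceland",
        "tour": {"id": raw["route_code"], "name": raw["route_name"].strip()},
        "availability": {
            "date": raw["departure_date"],
            "slots_remaining": int(raw["seats_left"]),
        },
        "pricing": {
            "currency": raw["fare"]["currency"],
            "adult_price": int(raw["fare"]["adult"]),
        },
    }


# One adapter per source domain; the remaining five would be registered the same way.
NORMALISERS: dict[str, Callable[[Record], Record]] = {
    "dive.is": normalise_dive_is,
    "bustravel.is": normalise_bus_travel,
}


def normalise(source: str, raw: Record) -> Record:
    """Route a raw record to its platform adapter and return the unified shape."""
    try:
        adapter = NORMALISERS[source]
    except KeyError:
        raise ValueError(f"no normaliser registered for {source!r}") from None
    return adapter(raw)
```

Keeping the adapters separate means a layout change on one site only touches one function, while downstream consumers keep reading the same keys.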
Mandatory Output Fields

All mandatory attributes — captured per tour, per date, per platform

Every extracted tour record contains the full set of mandatory fields, populated and validated before delivery. Missing mandatory fields trigger extraction retry — no partial records are ever delivered.

Tour Identification
Tour ID (platform-native)
Tour Name / Title
Tour Category / Type
Sub-Category / Activity
Platform / Source OTA
Tour URL (canonical)
Extraction Timestamp
Schedule & Availability
Available Date
Departure Time(s)
Duration (hours / days)
Availability Status
Remaining Slots / Capacity
Booking Cut-off Time
Cancellation Policy
Pricing & Rates
Adult Price (base)
Child Price (if applicable)
Group Rate / Discount
Early Bird Price
Currency (ISK / EUR / USD)
Add-on / Optional Extras
Price Tier / Category
Location & Logistics
Departure Point
Meeting Point / Address
Pick-up Availability (Y/N)
Pick-up Locations (list)
Drop-off Point
Geographic Region
GPS Coordinates
Tour Details
Short Description
Full Description
Inclusions
Exclusions
Requirements / Age Limits
Difficulty Level
Language(s) Available
Ratings & Media
Review Score (platform)
Number of Reviews
Review Summary / Tags
Primary Image URL
Gallery Image URLs
Badge / Award Tags
Featured / Promoted Status
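The "retry on missing mandatory fields, never deliver partial records" rule could be sketched as follows. This is an assumption-laden illustration: `MANDATORY_PATHS` lists only a handful of the fields above, expressed as paths into the sample record layout, and the limit of three attempts is a placeholder because the case study does not state how many extraction retries are made.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]

# Illustrative subset of the mandatory fields, as paths into a nested record.
MANDATORY_PATHS = [
    ("tour", "id"),
    ("tour", "name"),
    ("availability", "date"),
    ("availability", "status"),
    ("pricing", "currency"),
    ("pricing", "adult_price"),
]


def missing_fields(record: Record) -> list[str]:
    """Return the dotted paths of mandatory fields that are absent or empty."""
    missing = []
    for path in MANDATORY_PATHS:
        node: Any = record
        for key in path:
            node = node.get(key) if isinstance(node, dict) else None
        if node is None or node == "":
            missing.append(".".join(path))
    return missing


def extract_with_retry(
    extract: Callable[[], Record], max_attempts: int = 3
) -> Optional[Record]:
    """Re-run the extractor until a complete record comes back.

    Returns None when every attempt is incomplete, so the caller can
    withhold the record instead of delivering it partially filled.
    max_attempts is an assumed value.
    """
    for _ in range(max_attempts):
        record = extract()
        if not missing_fields(record):
            return record
    return None
```

A value of `0` counts as populated here, which matters for fields such as remaining slots on a sold-out date.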

Sample JSON output — single tour record

Sample Output: Arctic Adventures / Super Jeep Highland Tour (JSON · FTP Delivery)
{
  "platform": "Arctic Adventures",
  "platform_url": "adventures.is",
  "extracted_at": "2025-01-13T06:00:00Z",
  "tour": {
    "id": "AA-JEEP-HIGHLAND-001",
    "name": "Super Jeep Highland Adventure — Landmannalaugar",
    "category": "Adventure / Super Jeep",
    "sub_category": "Highland Day Tour",
    "url": "https://adventures.is/iceland/day-tours/super-jeep-highland",
    "duration_hours": 12,
    "difficulty": "Moderate",
    "languages": ["English", "German"],
    "review_score": 4.9,
    "review_count": 284,
    "badge": "Top Seller",
    "featured": true
  },
  "location": {
    "departure": "Reykjavik — BSI Bus Terminal",
    "region": "Highland Iceland",
    "pickup_available": true,
    "pickup_locations": ["Reykjavik Hotels", "Keflavik Airport"],
    "gps": {"lat": 63.9919, "lng": -22.7028}
  },
  "availability": {
    "date": "2025-06-21",
    "departure_time": "07:30",
    "status": "Available",
    "slots_remaining": 6,
    "total_capacity": 12,
    "booking_cutoff": "2025-06-20T23:59:00Z",
    "cancellation_policy": "Free cancellation up to 48 hours before"
  },
  "pricing": {
    "currency": "ISK",
    "adult_price": 34900,
    "child_price": 17450,
    "group_discount": "10% for 8+ persons",
    "early_bird_price": 31900,
    "early_bird_condition": "Booked 30+ days in advance",
    "add_ons": [
      {"name": "Lunch Pack", "price": 2500},
      {"name": "Private Guide Upgrade", "price": 15000}
    ]
  },
  "inclusions": ["Super Jeep transport", "Professional guide", "All entrance fees"],
  "exclusions": ["Meals", "Personal travel insurance"],
  "images": {
    "primary": "https://adventures.is/media/tours/super-jeep-highland-hero.jpg",
    "gallery": ["...image-2.jpg", "...image-3.jpg"]
  },
  "meta": {
    "forward_window_days": 270,
    "extraction_run": "2025-W03"
  }
}
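For teams loading these files into a dashboard or spreadsheet, one common first step is to flatten the nested record into a single row. The sketch below is a consumer-side illustration, not part of the delivered pipeline: `flatten` is a hypothetical helper, and the JSON literal is a trimmed copy of the sample record above.

```python
import json
from typing import Any


def flatten(node: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Collapse nested objects into dotted column names; lists are kept as-is."""
    flat: dict[str, Any] = {}
    for key, value in node.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


# Trimmed copy of the sample record shown above.
record = json.loads("""
{
  "platform": "Arctic Adventures",
  "tour": {"id": "AA-JEEP-HIGHLAND-001", "duration_hours": 12},
  "availability": {"date": "2025-06-21", "slots_remaining": 6, "total_capacity": 12},
  "pricing": {"currency": "ISK", "adult_price": 34900, "early_bird_price": 31900}
}
""")

row = flatten(record)
early_bird_saving = row["pricing.adult_price"] - row["pricing.early_bird_price"]
print(sorted(row))
print(early_bird_saving, row["pricing.currency"])
```

With the values in the sample, the early-bird saving works out to 3,000 ISK on the adult price.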
Technical Challenges

What made this project complex — and how we solved it

Key technical challenges across 7 platforms
270-day forward calendar extraction: Each platform renders availability calendars differently — some via JavaScript, others via AJAX API calls. Capturing 270 days of forward data required platform-specific calendar navigation logic for all 7 sites, including handling months that don't load until the user navigates to them.
Dynamic and layered pricing logic: Platforms like Reykjavik Excursions and Arctic Adventures apply multiple pricing layers simultaneously — base rate, early-bird discount, group rate, seasonal surcharge and add-on pricing. Capturing all pricing dimensions per tour per date required structured multi-field extraction, not just headline price.
Session-based booking flows: Dive.is and Mountain Guides use session-based booking flows that initialise slot-specific pricing and availability only after a date is selected — requiring headless browser automation to simulate user interaction before pricing data becomes visible in the DOM.
TravelScrape solutions applied
Dedicated extraction module per platform: Built a separate extraction engine per platform tuned to each site's exact DOM structure, JavaScript rendering behaviour, calendar navigation pattern and session requirements.
Headless browser for session-based platforms: Dive.is and Mountain Guides handled with headless Chromium — simulating date selection to capture dynamically rendered slot pricing and availability before extraction.
Multi-layer pricing extractor: For Reykjavik Excursions and Arctic Adventures, a structured pricing parser captures all pricing variants (base, early-bird, group, add-on) per tour per departure date in a single extraction pass.
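Calendars that load one month at a time imply that the crawler has to know which months to step through before it can read any dates. A minimal sketch of that planning step, assuming the same inclusive 270-day window as elsewhere in this case study (`months_to_visit` is a hypothetical name, and the actual navigation logic per site is not shown):

```python
from datetime import date, timedelta


def months_to_visit(run_date: date, days: int = 270) -> list[tuple[int, int]]:
    """List the (year, month) calendar pages that overlap the forward window."""
    last_day = run_date + timedelta(days=days - 1)
    months = []
    year, month = run_date.year, run_date.month
    while (year, month) <= (last_day.year, last_day.month):
        months.append((year, month))
        month += 1
        if month > 12:  # roll over into the next year
            year, month = year + 1, 1
    return months


# A January run stays inside one year; a June run crosses into the next.
print(months_to_visit(date(2025, 1, 13)))
print(months_to_visit(date(2025, 6, 2)))
```

Each tuple corresponds to one "next month" navigation in a headless browser session, or to one request if a site exposes a month-level endpoint.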
Technical Pipeline

End-to-end weekly extraction pipeline — source to FTP

1. 7 OTA Sources: public websites, no login
2. Extraction Engine: per-platform parsers + headless browser
3. 270-Day Crawler: rolling calendar, all dates per tour
4. Normalisation: 7 schemas → 1 unified JSON
5. Validation: all mandatory fields verified
6. JSON Output: structured, typed, nested
7. FTP Delivery: client FTP, Monday 06:00 UTC

Week 1 execution timeline — scoping to delivery

Day 1 · Scoping & Platform Audit
Detailed audit of all 7 OTA platforms — structure, pricing model, calendar behaviour
Mapped DOM structure, JavaScript rendering, pricing layer logic and calendar navigation for each platform. Identified Dive.is, Mountain Guides and Troll as requiring a headless browser, and Reykjavik Excursions, Arctic Adventures, Nice Travel and Bus Travel as requiring enhanced proxy configuration and multi-layer pricing extraction.
✓ All 7 platforms fully mapped
Day 2 (Morning) · Build
Platform-specific extraction modules built and tested per site
Built dedicated extraction engine per platform. Configured headless Chromium for session-based sites. Built multi-layer pricing parser for Reykjavik Excursions and Arctic Adventures. Configured 270-day calendar crawl logic — tested on Troll (weather-dependent availability edge cases) and Arctic Adventures (deposit-based booking display).
✓ All 7 extraction modules live and tested
Day 2 (Evening) · First Extraction Run
First complete 270-day extraction across all 7 platforms
Full extraction run completed — all 7 platforms, 270-day forward window. 14,800+ tour-date records extracted. Normalisation layer applied. Field-level validation passed — 100% mandatory fields populated across all records. JSON files generated per platform.
✓ 14,800+ records · 100% fields · 0 partial records
Day 2 — 48hrs from scoping · First Delivery
JSON files delivered to client FTP endpoint — all 7 platforms
All platform JSON files pushed to client FTP endpoint within 48 hours of project kickoff. Client confirmed receipt and validated field completeness against mandatory field checklist. Weekly automated schedule activated: extraction runs Sunday 23:00 → Monday 05:00 UTC, with delivery every Monday by 06:00 UTC.
✓ FTP delivery confirmed · Weekly automation active
Delivery Specification

Weekly JSON delivery — format, structure and FTP specification

Output Format — JSON
One JSON file per platform per weekly run
Nested structure: platform → tour → date → pricing
All fields typed — string, integer, boolean, array
ISO 8601 timestamps throughout
UTF-8 encoding — supports Icelandic characters (þ, ð, æ, ö)
File naming: {platform}_{YYYYMMDD}.json
Delivery Mode — FTP
Delivered to client-specified FTP endpoint
Folder structure: /iceland-ota/{YYYY-WW}/
Delivery window: Monday 06:00–08:00 UTC
Confirmation email after each successful push
3 automated retry attempts on FTP push failure
12 weeks of prior deliveries archived
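The file naming, folder structure and retry rules above can be combined into a short sketch using Python's standard `ftplib`. Several details are assumptions rather than facts from the project: the platform slug format, zero-padded ISO week numbers for `{YYYY-WW}`, and reading "3 automated retry attempts" as one initial push plus up to three retries. Back-off delays and the confirmation email are left out.

```python
import ftplib
import io
from datetime import date
from typing import Callable


def delivery_path(platform: str, run_date: date) -> tuple[str, str]:
    """Build the (folder, filename) pair for one platform file."""
    iso_year, iso_week, _ = run_date.isocalendar()
    folder = f"/iceland-ota/{iso_year}-{iso_week:02d}/"  # assumed ISO week numbering
    slug = platform.lower().replace(" ", "-")  # assumed slug convention
    return folder, f"{slug}_{run_date:%Y%m%d}.json"


def ftp_upload(host: str, user: str, password: str,
               folder: str, filename: str, payload: bytes) -> None:
    """Upload one file; host and credentials are supplied by the caller."""
    with ftplib.FTP(host) as ftp:
        ftp.login(user, password)
        ftp.cwd(folder)
        ftp.storbinary(f"STOR {filename}", io.BytesIO(payload))


def push_with_retry(upload: Callable[[], None], retries: int = 3) -> bool:
    """Run the upload once, then retry up to `retries` times on FTP errors."""
    for _ in range(1 + retries):
        try:
            upload()
            return True
        except ftplib.all_errors:
            continue
    return False


folder, filename = delivery_path("Arctic Adventures", date(2025, 1, 13))
print(folder + filename)
```

In use, the upload would be wrapped in a zero-argument callable, for example `push_with_retry(lambda: ftp_upload(host, user, password, folder, filename, payload))`, so the retry logic stays independent of the transport.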
Frequency — Weekly Full Refresh
Full re-extraction of all 7 platforms every week
270-day forward window recalculated from each Monday
Full dataset every run, not an incremental or delta-only feed
Avg extraction duration: 4.5 hours (all 7 platforms)
Extraction window: Sunday 23:00 → Monday 05:00 UTC
Manual re-run available on request — same-day
Quality Assurance
Field-level validation — no partial records delivered
Record count check vs prior week — >10% drop triggers alert
Price range sanity check per platform — anomaly detection
Availability status cross-check across tour dates
Schema version control — field changes notified 2 weeks ahead
QA report available on request per delivery
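Two of the QA checks above lend themselves to a compact illustration. The 10% threshold comes from the list; everything else is an assumption, including the function names and the idea of a fixed per-platform price band, since the case study does not say how its anomaly detection works.

```python
def record_count_alert(current: int, previous: int, max_drop: float = 0.10) -> bool:
    """Return True when this week's record count fell by more than max_drop."""
    if previous <= 0:
        return False  # nothing to compare against on a first run
    return (previous - current) / previous > max_drop


def price_outliers(prices: list[int], low: int, high: int) -> list[int]:
    """Return prices outside an expected band (band limits are hypothetical)."""
    return [price for price in prices if not low <= price <= high]


# 14,800 records last week: 13,000 is a 12.2% drop, 13,400 is a 9.5% drop.
print(record_count_alert(13_000, 14_800))
print(record_count_alert(13_400, 14_800))
print(price_outliers([34_900, 31_900, 349], low=5_000, high=150_000))
```

A simple band like this mainly catches parsing slips such as a dropped thousands separator; a production check would more likely compare against each tour's own price history.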
Results & Impact

What the client achieved with weekly Iceland OTA data

The structured weekly dataset enabled the client's business intelligence team to build pricing models, competitive dashboards and demand forecasting tools that were not possible before consistent, normalised data from all 7 platforms was available in one place.

48hr: From scoping to first FTP delivery (all 7 platforms, 270-day forward window)
14,800+: Tour-date records extracted per weekly run across all 7 Iceland OTA platforms
100%: Mandatory field coverage; every delivered record fully populated, no partial records
270 days: Forward pricing and availability visibility, rolling and refreshed every Monday
52 weeks: Continuous delivery, with zero missed weekly extractions across 12 months of operation
7 / 7: All target platforms delivering at 100%, with no platform-level extraction failures

"Before TravelScrape, our team spent 6+ hours per week manually checking pricing across 3–4 Iceland platforms — getting inconsistent, non-comparable data with no structured format. Now we receive all 7 platforms, fully structured, every Monday morning. Our pricing model runs automatically on fresh data. We went from weekly manual effort to a fully automated competitive intelligence pipeline."

— Business Intelligence Lead, Client Organisation

What the weekly data enables downstream

Business intelligence use cases unlocked by weekly Iceland OTA data
Competitive pricing dashboard: Weekly-refreshed view of all 7 platforms' pricing per tour category per date, such as Arctic Adventures vs Mountain Guides glacier tour pricing or Reykjavik Excursions vs Nice Travel Golden Circle pricing, side by side, every week.
270-day demand forecasting: Availability tightening patterns across platforms signal demand peaks 4–8 weeks ahead — identifying high-demand dates before they fill, enabling pricing strategy decisions well in advance of the actual travel dates.
Seasonal pricing trend analysis: 12+ months of weekly data reveals exact pricing curves for Northern Lights season (Oct–Mar) and Midnight Sun season (May–Aug) — by platform, by tour type, by geographic region of Iceland.
New tour detection: Weekly comparison automatically identifies new tours launched on any of the 7 platforms — competitor product intelligence delivered without any manual monitoring.
Availability gap analysis: Identifies dates where 5+ platforms show "Sold Out" or "Limited" — signals undersupplied dates in the Iceland market where pricing headroom exists for own product or partner offerings.
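Two of the use cases above, new tour detection and availability gap analysis, reduce to simple set operations on the weekly files. The sketch below assumes records have already been flattened to `platform`, `tour_id`, `date` and `status` keys, which is a hypothetical shape chosen for brevity; the threshold of 5 platforms and the "Sold Out" / "Limited" labels come from the list.

```python
from collections import defaultdict
from typing import Iterable

Row = dict[str, str]


def new_tours(previous: Iterable[Row], current: Iterable[Row]) -> list[tuple[str, str]]:
    """Return (platform, tour_id) pairs present this week but not last week."""
    seen = {(row["platform"], row["tour_id"]) for row in previous}
    now = {(row["platform"], row["tour_id"]) for row in current}
    return sorted(now - seen)


def undersupplied_dates(
    rows: Iterable[Row],
    min_platforms: int = 5,
    tight: tuple[str, ...] = ("Sold Out", "Limited"),
) -> list[str]:
    """Return dates on which at least min_platforms show tight availability."""
    platforms_by_date: dict[str, set[str]] = defaultdict(set)
    for row in rows:
        if row["status"] in tight:
            platforms_by_date[row["date"]].add(row["platform"])
    return sorted(
        day for day, platforms in platforms_by_date.items()
        if len(platforms) >= min_platforms
    )
```

Counting distinct platforms rather than records keeps one operator with many sold-out departures from triggering the signal on its own.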

Need OTA data extraction for your market?

Tell us your target platforms, coverage window and delivery format — we scope in 24 hours and deliver in 48. No contract required.