
An Open, Low-Cost IoT Weather and Air-Quality Monitoring Node: Design, Methods, and Evaluation

DOI : https://doi.org/10.5281/zenodo.18103624


 


Rafid Mohammed Khaleefah

Information and Communication Technology Centre, University of Basrah. Basrah, Iraq.

Abstract– This paper presents the design, implementation, and evaluation of a low-cost IoT weather and air-quality monitoring node built around commodity sensors (BMP280, MQ-135). Telemetry is published over MQTT and HTTP(S), and ThingSpeak dashboards support online visualization and MATLAB analytics. We document the bill of materials, firmware workflow, data schemas, and an evaluation method for accuracy, latency, availability, and energy, and we position the node against the state-of-the-art by comparing it to recent IoT pipelines [1]-[7]. Field trials collected multi-day traces showing mean absolute error below typical low-cost sensor envelopes for temperature and humidity (≤3 %RH), stable pressure estimation, and end-to-end latency of 1.1-1.9 s at 15-60 s sampling cadences. We release circuits, code, and cloud recipes for researchers building low-cost meteorological stations [8]-[12].

I. Introduction

Low-cost IoT weather stations combine commodity sensors, wireless connectivity, and cloud platforms to deliver continuous measurement, quality control (QC), and near-real-time visualization. By streaming telemetry from heterogeneous devices into a coherent data fabric, they enable analytics ranging from simple persistence models to learned forecasters. What is often missing in the literature, however, is an end-to-end, reproducible account of wiring, firmware state transitions, payload schemas, and failure recovery that also quantifies the trade-offs among publishable KPIs.

Reference-grade stations recommended by the World Meteorological Organization are costly to maintain and are often deployed kilometers apart. This spacing cannot resolve micro-meteorological variability in urban canyons and coastal zones, and national networks cannot support hyper-local applications such as scheduling outdoor events and triaging heat-health risks [6],[8],[12]. In response, nodes built on commodity sensors and embedded microcontrollers cost orders of magnitude less than reference systems [12] and suit single-investigator deployments, such as campus labs, where budgets are constrained and where audits by peer reviewers matter. The node described here can be extended with particulate-matter sensing.

This paper contributes the design and evaluation of a full stack (hardware, firmware, telemetry, QC, and visualization) organized around five design principles, including schema-first data, robust telemetry, and affordability. We provide a detailed bill of materials, a documented topic hierarchy and payload schema, and shareable dashboards [1],[3],[4]. We show that, with modest calibration, temperature, humidity, and pressure measurements from low-cost sensors fall within datasheet envelopes, and that MQTT's session semantics deliver reliable transport for frequent, small telemetry frames [1]. We position the prototype against recent systems [6],[8],[12], highlighting where our approach aligns with and extends prior work.

II. Related Work

Device integration. Prior work demonstrates integrated low-cost nodes for environmental measurements. [5] deploys a multi-sensor suite in a harsh environment; [3] emphasizes a portable Arduino-based design; other designs target remote-area operation via cellular links; [6] pairs particulate-matter sensors with a LoRaWAN backhaul validated at city scale; and low-cost Arduino builds [9] stress component costs and reproducibility. The review in [8] surveys low-cost IoT weather prediction and monitoring.

Networking and cloud ingestion. MQTT and HTTP(S) are the dominant transports. MQTT adds little per-message overhead, supports QoS levels and retained "last-will" messages, and is preferred for frequent small payloads. HTTP(S) is straightforward to integrate with cloud platforms like ThingSpeak but incurs greater per-request overhead [1].

III. System Architecture and Methods

Hardware and assembly. The core is a Wi-Fi-enabled microcontroller (ESP8266/ESP32 class) wired to low-cost sensors: a BMP280 for pressure and temperature, an MQ-135 for air quality, and a digital humidity sensor. Sensors are sampled at a configurable cadence (15-60 s), calibrated readings are encoded as JSON, and frames are published over MQTT or HTTPS.

Cloud ingestion and QC. The back-end stores raw frames and a curated, QC'd series. QC applies (i) hard range filters; (ii) a Hampel outlier filter (window 9-11 samples, 3-3.5σ); (iii) step-test flags (e.g., fan proximity); (iv) resampling to a uniform cadence with forward fill for short gaps; and (v) derivation of heat index and dew point (Magnus formula). Basic availability and latency metrics are computed per hour and per day from device/server timestamps. Simple persistence and damped-trend baselines provide reference forecasts for future edge-modelling work [2].
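The persistence and damped-trend reference forecasts mentioned above can be sketched in a few lines of Python; function names and smoothing parameters are illustrative, not taken from the paper's codebase:

```python
from typing import List

def persistence_forecast(history: List[float], horizon: int = 1) -> List[float]:
    """Naive baseline: every future value equals the last observation."""
    return [history[-1]] * horizon

def damped_trend_forecast(history: List[float], horizon: int = 1,
                          alpha: float = 0.5, beta: float = 0.3,
                          phi: float = 0.9) -> List[float]:
    """Holt's linear exponential smoothing with a damped trend."""
    level, trend = history[0], 0.0
    for x in history[1:]:
        prev_level = level
        level = alpha * x + (1 - alpha) * (level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    # h-step-ahead forecasts damp the trend geometrically
    out, damp = [], 0.0
    for h in range(1, horizon + 1):
        damp += phi ** h
        out.append(level + damp * trend)
    return out
```

The damped trend avoids the runaway extrapolation of a plain linear trend, which suits slowly varying micro-meteorological series.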

Fig. 2. MQTT topic hierarchy used in this study, showing root/domain/device segments and two publishing options (JSON vs. individual streams), with QoS1 and retained status.
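A minimal sketch of how a node might publish into the hierarchy of Fig. 2. The topic segments, broker hostname, and device ID are illustrative assumptions, and the publishing part requires the third-party paho-mqtt package and a reachable broker:

```python
import json
import time
from typing import Optional

ROOT = "weather"      # hypothetical root segment
DOMAIN = "basrah"     # hypothetical domain segment

def topic_for(device_id: str, stream: Optional[str] = None) -> str:
    """Build root/domain/device[/stream] topics per the Fig. 2 hierarchy."""
    base = f"{ROOT}/{DOMAIN}/{device_id}"
    return f"{base}/{stream}" if stream else base

def json_payload(seq: int, temp_c: float, rh: float, press_hpa: float) -> str:
    """One JSON frame carrying all measurements (the 'JSON' option in Fig. 2)."""
    return json.dumps({"seq": seq, "ts": int(time.time()),
                       "temp_c": temp_c, "rh": rh, "press_hpa": press_hpa})

if __name__ == "__main__":
    import paho.mqtt.client as mqtt
    client = mqtt.Client()
    # Retained last-will marks the node offline if the session dies unexpectedly.
    client.will_set(topic_for("node01", "status"), "offline", qos=1, retain=True)
    client.connect("broker.example.org", 1883, keepalive=60)
    client.publish(topic_for("node01", "status"), "online", qos=1, retain=True)
    client.publish(topic_for("node01"), json_payload(1, 24.5, 41.0, 1009.8), qos=1)
```

The retained status topic is what makes simple fleet-health dashboards possible: any late subscriber immediately sees each node's last known state.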

On the HTTP path, measurements are mapped to ThingSpeak channels/fields, mindful of rate limits (≈15 s minimum per channel).
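A sketch of this HTTP path, assuming the public ThingSpeak update endpoint and a hypothetical API key; the local guard simply sleeps so that successive POSTs to the same channel respect the ~15 s minimum:

```python
import time
import urllib.parse
import urllib.request

THINGSPEAK_URL = "https://api.thingspeak.com/update"
MIN_INTERVAL_S = 15.0   # per-channel minimum update interval

_last_post = 0.0        # monotonic time of the previous POST

def map_fields(temp_c: float, rh: float) -> dict:
    """Map measurements onto the channel's field1/field2 slots."""
    return {"field1": f"{temp_c:.2f}", "field2": f"{rh:.1f}"}

def post_update(api_key: str, fields: dict) -> bytes:
    """POST one update, blocking locally so the channel rate limit is respected."""
    global _last_post
    wait = MIN_INTERVAL_S - (time.monotonic() - _last_post)
    if wait > 0:
        time.sleep(wait)
    data = urllib.parse.urlencode({"api_key": api_key, **fields}).encode()
    with urllib.request.urlopen(THINGSPEAK_URL, data=data, timeout=10) as resp:
        _last_post = time.monotonic()
        return resp.read()   # the new entry id, or b"0" if the update was rejected
```

Usage: `post_update("YOUR_WRITE_KEY", map_fields(24.5, 41.2))`, where the key placeholder is, of course, illustrative.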

Fig. 3. ThingSpeak channel and field configuration (site: Baghdad), showing the creation of a ThingSpeak channel and the mapping of fields.

Fig. 4. MATLAB Analysis output showing humidity as dew point and a gauge visualization.

Both paths can be enabled simultaneously; a monotonic sequence and server timestamps allow reconciliation of duplicates. TLS is enforced for HTTPS; MQTT can run over TLS where a broker supports it; otherwise, we rely on WPA2 network security and the absence of sensitive personal data in payloads.
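The reconciliation rule can be sketched as follows. The frame layout is an assumption; the text specifies only deduplication via the monotonic sequence number and server timestamps:

```python
from typing import Dict, List

def reconcile(frames: List[dict]) -> List[dict]:
    """Merge dual-path frames, keeping the earliest server receipt per seq.

    Each frame is a dict with 'seq', 'server_ts', and 'path' ('mqtt'/'https').
    """
    best: Dict[int, dict] = {}
    for f in frames:
        cur = best.get(f["seq"])
        if cur is None or f["server_ts"] < cur["server_ts"]:
            best[f["seq"]] = f
    return [best[s] for s in sorted(best)]
```

Keeping the earlier receipt also yields a free per-frame latency comparison between the two transports.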

Calibration and co-location. We recommend a two-point humidity check (e.g., saturated salt solutions) and a single-point temperature offset (ice-water bath proximity) prior to deployment, followed by a 24-h co-location with a known-good station. Linear corrections (gain/offset) are persisted in non-volatile memory and versioned in metadata. Pressure is compared against a METAR reference with altitude correction. These practices reflect guidance and experience summarized across [5],[10],[12].
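A sketch of the two-point fit and the versioned linear correction. The saturated-salt reference values used in the usage note below (LiCl ≈ 11.3 %RH, NaCl ≈ 75.3 %RH near room temperature) are a common choice and an assumption here, not the paper's exact procedure:

```python
import json
from typing import Tuple

def fit_two_point(raw_lo: float, ref_lo: float,
                  raw_hi: float, ref_hi: float) -> Tuple[float, float]:
    """Solve gain/offset so both reference points are reproduced exactly."""
    gain = (ref_hi - ref_lo) / (raw_hi - raw_lo)
    return gain, ref_lo - gain * raw_lo

def apply_cal(raw: float, gain: float, offset: float) -> float:
    """Linear correction persisted in non-volatile memory: gain * raw + offset."""
    return gain * raw + offset

def cal_record(channel: str, gain: float, offset: float,
               valid_from: str, version: int) -> str:
    """Coefficients plus valid-from metadata, versioned into every message."""
    return json.dumps({"channel": channel, "gain": gain, "offset": offset,
                       "valid_from": valid_from, "cal_version": version})
```

For example, `fit_two_point(raw_at_licl, 11.3, raw_at_nacl, 75.3)` yields the RH gain/offset pair to store.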

Evaluation plan. We quantify accuracy (MAE), latency (timestamp delta), packet delivery ratio (PDR), availability (connected fraction), and energy draw. We vary sampling cadence (15-60 s) and assess both indoor and outdoor contexts, annotating disturbances (door/window events, sun exposure) for interpretation. To support replication, we publish the message schema, QC thresholds, and plotting scripts alongside raw CSV exports.
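These KPIs can be computed directly from paired device/server logs; a minimal sketch with illustrative function names:

```python
import statistics
from typing import List

def mae(obs: List[float], ref: List[float]) -> float:
    """Mean absolute error against the co-located reference."""
    return sum(abs(a - b) for a, b in zip(obs, ref)) / len(obs)

def median_latency(device_ts: List[float], server_ts: List[float]) -> float:
    """End-to-end latency: server receive timestamp minus device timestamp."""
    return statistics.median(s - d for d, s in zip(device_ts, server_ts))

def pdr(delivered: int, published: int) -> float:
    """Packet delivery ratio, in percent."""
    return 100.0 * delivered / published

def availability(connected_s: float, window_s: float) -> float:
    """Fraction of the experiment window with a live connection."""
    return connected_s / window_s
```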

Field definitions, units, and update intervals used in this study.

Field | Unit | Description | Update Interval
field1: Temperature | °C (also logged in °F) | Ambient air temperature; converted to °C for analytics | 15-60 s
field2: Humidity | %RH | Relative humidity at sensor location | 15-60 s

Data sets and annotations. For each run, we archived device-side logs (boot banners, Wi-Fi RSSI, retries), server-side receive times, and QC’d series. Events such as window opening, fan activation, or sunlight exposure were annotated in a lab log to contextualize steps in the traces. Sampling cadences of 15, 30, and 60 s were tested for 2-6 h sessions, yielding thousands of samples per run.

Reference instrumentation. A calibrated hand-held thermo-hygrometer (±0.3 °C, ±2 %RH) was used for spot checks; barometric pressure was cross-checked with airport METAR reports corrected for site elevation. While not a fully traceable reference, this arrangement supports the practical MAE calculations reported below and mirrors co-location practice in [5],[10].

A. Statistical Methods and Derived Indices

QC uses a Hampel filter with window w and threshold k. Given a series x_t, the median m_t and median absolute deviation MAD_t are computed within the window; points where |x_t − m_t| > k·1.4826·MAD_t are flagged. Dew point is estimated with the Magnus formula: Td = (b·γ)/(a − γ), with γ = ln(RH/100) + a·T/(b + T), a = 17.62, b = 243.12 °C; heat index follows NOAA's regression for T > 27 °C and RH > 40%. Latency is computed as server_receive_ts − device_ts; availability is the fraction of time since the first sample with non-missing status heartbeats.
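The Magnus dew-point estimate transcribes directly into code (coefficients a = 17.62, b = 243.12 °C from the text):

```python
import math

A, B = 17.62, 243.12  # Magnus coefficients; B is in deg C

def dew_point_c(temp_c: float, rh_pct: float) -> float:
    """Td = B*gamma/(A - gamma), gamma = ln(RH/100) + A*T/(B + T)."""
    gamma = math.log(rh_pct / 100.0) + A * temp_c / (B + temp_c)
    return B * gamma / (A - gamma)
```

A useful sanity check: at saturation (RH = 100%) the formula collapses to Td = T, so the dew point equals the air temperature exactly.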

  1. Results

    Accuracy. Across three indoor and two outdoor sessions, temperature MAE versus the handheld reference ranged 0.4-0.6 °C after a single-point offset; relative humidity MAE was 2.0-3.0 %RH with a two-point adjustment; pressure bias remained within 1.0-1.5 hPa versus the METAR baseline. These values fall within expected datasheet envelopes for low-cost sensors when properly shielded and calibrated [5],[12].

Latency and reliability. Median end-to-end latency (device timestamp to server receive) was 1.1-1.4 s for MQTT (QoS1) and 1.4-1.9 s for HTTPS POST at 30-60 s cadence on an uncongested Wi-Fi network. Packet delivery ratio exceeded 99.2% for MQTT and 98.5% for HTTPS. Reconnection after transient AP loss favored MQTT due to keep-alive and session semantics; HTTPS incurred TLS handshakes more frequently. These findings echo prior comparative studies [1].

Table III. Summary of key quantitative metrics comparing MQTT and HTTPS.

Metric | MQTT (QoS1) | HTTPS
Temperature MAE (°C) | 0.6 | 0.6
Humidity MAE (%RH) | 3.0 | 3.0
Pressure bias (hPa) | 1.5 | 1.5
Median latency (s) | 1.4 | 1.9
PDR (%) | 99.2 | 98.5
Availability (%) | 99.0 | 99.0

Notes: MAE computed against the handheld/reference observations; median latency measured from device timestamp to cloud persistence; PDR = delivered/published × 100%; availability reflects end-to-end connectivity over the experiment window.

    Availability and behavior. The node sustained >99% connectivity over multi-hour runs. Outdoor diurnal cycles exhibited ~8-10 °C amplitude with humidity inversions (higher RH at night). Indoor perturbations (door opening, fan use) appeared as clear steps and transient micro-gust signatures in the humidity channel. Pressure traces were smooth, modulated by synoptic-scale changes, validating sensor stability.

Alerts were implemented via PushBullet: when a derived metric exceeded a configured threshold, a MATLAB routine posted a notification to a device/channel, which immediately appeared on the paired smartphone, as shown in Fig. 6.

    Fig. 5. Plot of the weather channel data in an hourly average format for the last 24 hours in Baghdad (BGD).

    Fig. 6. Alert pipeline: QC’d telemetry triggers a MATLAB threshold evaluator, which posts to a PushBullet device/channel and raises an immediate smartphone notification (non-breach events are logged only).

    A. Error Budget Analysis

    Temperature error arises from sensor quantization, calibration residuals, and radiation bias inside the shield. Humidity error includes hysteresis and temperature-dependent bias; pressure error is dominated by absolute offset and weather-system variability. In our runs, the dominant contributors were radiation bias (outdoors) and RH hysteresis after rapid humidity steps. Practical mitigations include aspirated shields, slower sampling during stabilization, and periodic re-zeroing in a sealed bag with desiccant for the RH sensor.

  2. Discussion

    Measurement fidelity. The observed MAEs suggest that with modest screening and co-location, common low-cost sensors yield decision-quality meteorological data for many applications. Remaining biases reflect micro-environmental differences and radiation loading; we recommend an aspirated shield for outdoor deployments and redundant humidity sensing to detect drift [12].

    Protocol choice. For frequent, small payloads, MQTT is technically superior to HTTPS due to reduced header overhead and QoS/retained semantics; HTTPS remains attractive when integrating with purely REST-based platforms or when messages are infrequent and human-readable logs are desired [1]. Our dual-path design allows easy A/B measurement and graceful degradation.

    Analytics stack. A pragmatic QC pipeline (range checks, Hampel filtering, resampling) catches the majority of spurious readings while preserving sharp changes that carry physical meaning (e.g., step changes when a window opens). Edge prediction is feasible on ESP32-class devices, but we recommend starting with persistence/damped-trend baselines and migrating compute to a nearby SBC (e.g., Raspberry Pi) if model complexity increases [2],[7].

Comparison to prior work. Our measurements align with [3],[4],[6] in demonstrating that low-cost, open designs can support actionable monitoring when paired with diligent QC and calibration. For PM sensing extensions, best practice includes frequent zero checks, humidity compensation, and periodic co-locations [10],[11], practices compatible with our pipeline but left for future work.

    1. Platform Comparison

ESP8266 remains cost-efficient but has less RAM/CPU headroom than ESP32, which eases TLS and local analytics. Raspberry Pi-class SBCs are over-provisioned for a single station but ideal as nearby edge aggregators hosting the MQTT broker and dashboards. On the cloud side, ThingSpeak excels for rapid prototyping with MATLAB analytics; custom stacks (MQTT+InfluxDB/TimescaleDB+Grafana) suit multi-node fleets; Ubidots and similar SaaS options trade configurability for speed of deployment.

Table IV. Platform comparison for IoT weather pipelines: a qualitative comparison of integration effort, latency, and analytics.

Platform | Ease of Integration | Latency (qual.) | Analytics
ThingSpeak | High (REST/MQTT; built-in MATLAB) | Low (small-scale, near real-time) | MATLAB Analysis
AWS IoT Core | Medium (IAM/services setup) | Low-Moderate | Kinesis/Timestream/SageMaker
Azure IoT Hub | Medium (enterprise-oriented) | Low-Moderate | Azure Stream Analytics/ML

  3. Recommendations

• Prefer MQTT (QoS1) for routine telemetry; keep HTTPS as a fall-back path.
    • Enforce schema versioning and include calibration coefficients and valid-from metadata in every message.
    • Use shielded enclosures; for outdoors, adopt multi-plate or aspirated shields; avoid direct solar load.
    • Perform two-point RH and single-point °C checks before deployment; revisit co-location quarterly.
    • Budget for PM sensor aging, humidity correction, and periodic field checks if air-quality is added [10],[11].
    • Centralize metrics dashboards (latency, PDR, uptime) to catch regressions quickly.
    • Archive raw frames and QC’d series; publish CSV and analysis notebooks for transparency.

• PM calibration: the PM module is not validated against a reference; co-locate with a reference counter and apply humidity correction.
• LoRaWAN backhaul: rural deployment range is limited without one; add a LoRaWAN gateway and compare duty-cycle constraints.

      1. Implementation Checklist

        • Assemble the node and verify sensor IDs;
        • Flash firmware;
        • Configure Wi-Fi, API keys, and MQTT topics;
        • Perform two-point RH and one-point °C checks;
        • Run a 24-h co-location;
        • Enable dual-path telemetry;
        • Review dashboards for latency/PDR outliers;
        • Archive raw/QC’d data and calibration coefficients;
        • Prepare a short deployment report with photos and site notes.
  4. Threats to Validity

Internal validity threats include the short co-location periods and reliance on handheld references; although calibrated, they are not traceable standards. External validity is limited by testing under favorable Wi-Fi conditions and a single geographic site; results may differ under congested networks or extreme climates. Construct validity could be impacted by unmodeled airflow and radiation bias; the recommended aspirated shield should mitigate these effects in future work.

Table V. Threats to Validity (Summary). Threat categories, evidence, and mitigations summarized from the Discussion.

5. Risk Assessment and Mitigation

Operational risks include power loss, Wi-Fi outages, API key leakage, and sensor drift. Mitigations: buffer a few samples locally; implement exponential back-off; rotate keys; store credentials in a separate header; alarm on stale data; schedule quarterly calibration checks; and maintain a spare node for hot-swap during failures.

  6. Practical Lessons Learned

(1) Shielding dominates temperature fidelity outdoors; cheap shields work indoors but underperform in full sun. (2) Dual-path telemetry simplifies A/B testing and helps diagnose backend issues; keep sequence numbers and timestamps consistent. (3) MQTT retain/last-will messages are invaluable for simple fleet health dashboards. (4) RH sensors exhibit hysteresis; slow ramp tests expose this behavior and guide calibration. (5) Document everything: wiring photos, sketch versions, and channel mappings; reviewers and future maintainers will thank you for this step.

Threat Category | Evidence | Mitigation
Internal validity | Short co-locations; handheld reference not traceable | Longer co-locations; borrow a traceable instrument
External validity | Single site (Baghdad) and favourable Wi-Fi conditions | Multi-site trials; congestion stress tests
Construct validity | Radiation bias and airflow effects on T/RH | Aspirated shield; redundant RH probes; step tests

  7. Conclusion and Future Work

We engineered and evaluated a reproducible IoT weather/air-quality node that delivers accurate measurements and robust telemetry using commodity components. The contributions include a full BOM, a firmware state machine and schema, dual-path telemetry, and a pragmatic QC/analytics stack. Results meet practical accuracy envelopes and corroborate findings across the literature [1]-[12]. Future work will extend the node with wind/rain sensors and PM modules, integrate an edge-forecasting block (e.g., NARX/XGBoost) [7], add LoRaWAN for rural backhaul [6], and conduct multi-week co-locations against reference stations to refine calibration models. We could also extend the node toward PWV estimation by adopting surface-based algorithms reported for BME280-equipped low-cost AWSs [9].

Table VI. Study limitations, with impact and proposed workarounds.

Limitation | Impact | Workaround / Future fix
No wind/rain sensors in current build | Cannot relate micro-climate dynamics to wind/rain events | Add anemometer + tipping-bucket rain gauge
No PM calibration | PM module not validated against a reference | Co-locate with a reference counter; humidity correction
No LoRaWAN backhaul | Limited rural deployment range | Add LoRaWAN gateway; compare duty-cycle constraints

  8. Detailed Future Work Plan

Phase 1 (months 1-2): integrate a PM sensor (SDS011 or PMS7003), add humidity correction, and validate against a reference counter for 7-10 days. Phase 2 (months 3-4): implement edge forecasting on an ESP32-S3 or Raspberry Pi Zero using persistence, ARIMA, and a compact NARX/XGBoost baseline; evaluate RMSE/MAE vs. naive baselines on rolling windows. Phase 3 (months 5-6): extend the backhaul with LoRaWAN for two remote nodes, compare coverage and duty-cycle implications, and perform a week-long outage-tolerance study. Phase 4 (month 7): automate calibration reminders and dashboard alarms for stale data and abnormal drifts.

  9. Expanded Recommendations

Document site metadata (GPS, elevation, shading, nearby heat sources); keep a per-node CHANGELOG; version firmware and schema jointly; prefer UTC everywhere; use idempotent HTTP endpoints or deduplicate on the server using seq+ts; visualize latency and PDR on the same time axis as temperature/humidity to spot coupling; treat dashboards as part of the scientific artifact; export figure PNG/SVG directly from code to avoid manual plotting errors; store raw data indefinitely and derive curated datasets programmatically to ensure provenance.

10. Appendix B – Algorithms and Pseudocode for the QC Pipeline

Function QC(series, w, k): for each channel c in {temp, rh, press}: resample to cadence τ; compute rolling median m_t and MAD_t; for each t: if |x_t − m_t| > k·1.4826·MAD_t then flag; compute dew_point(T, RH) and heat_index(T, RH); return curated series with flags.
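The Hampel step of this routine can be sketched in Python; truncating (rather than padding) the window at the series edges is an implementation assumption:

```python
import statistics
from typing import List

def hampel_flags(series: List[float], w: int = 11, k: float = 3.0) -> List[bool]:
    """Flag points where |x_t - m_t| > k * 1.4826 * MAD_t in a centered window."""
    half = w // 2
    flags = [False] * len(series)
    for t in range(len(series)):
        window = series[max(0, t - half): t + half + 1]  # truncated at the edges
        m = statistics.median(window)
        mad = statistics.median([abs(x - m) for x in window])
        # mad == 0 means a locally constant signal; nothing to flag there
        if mad > 0 and abs(series[t] - m) > k * 1.4826 * mad:
            flags[t] = True
    return flags
```

Because the filter compares against a local median, genuine step changes (e.g., a window opening) survive while isolated spikes are flagged.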

Parameter choices used in this study: window w = 11 samples, k = 3.0; τ ∈ {15, 30, 60} s.

Appendix C – Firmware State Machine Pseudocode: state BOOT → CONNECT (Wi-Fi, NTP) → SAMPLE (read sensors, apply cal) → ENCODE (JSON) → PUBLISH (MQTT QoS1; HTTPS POST) → SLEEP(τ − elapsed). On failure, exponential back-off with jitter; a watchdog resets on stall.
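The failure path's back-off with jitter can be sketched as the "full jitter" variant, where each delay is drawn uniformly below an exponentially growing, capped ceiling; base and cap values here are illustrative:

```python
import random
from typing import List, Optional

def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0,
                   rng: Optional[random.Random] = None) -> List[float]:
    """'Full jitter' back-off: delay_n ~ Uniform(0, min(cap, base * 2**n))."""
    rng = rng or random.Random()
    return [rng.uniform(0.0, min(cap, base * 2.0 ** n)) for n in range(attempts)]
```

The jitter decorrelates retries across nodes so that a fleet recovering from the same AP outage does not hammer the broker in lockstep.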

  11. Reproducibility Statement

    We provide message schemas, QC thresholds, and plotting scripts. All parameters required to reproduce figures (window sizes, thresholds, sampling cadences) are fixed in the text to enable exact replication by reviewers. Any future revisions will bump the schema version and enumerate changes in a CHANGELOG.

  12. Ethical and Responsible Deployment

Weather and air-quality data can influence public behavior. Publish prominent caveats about low-cost sensor limitations, calibration dates, and intended use. Avoid implying regulatory equivalence with reference-grade stations. When sharing geolocated data, consider jittering coordinates or aggregating temporally/spatially to protect privacy of building occupants.

References

  1. Fahim, M., El Mhouti, A., Boudaa, T., Jakimi, A., Modeling and implementation of a low-cost IoT-smart weather monitoring station and air quality assessment based on fuzzy inference model and MQTT protocol, Modeling Earth Systems and Environment, 2023. doi:10.1007/s40808-023-01701-w.
2. İşler, B., Kaya, Ş. M., Kılıç, F. R., Fog-Enabled Machine Learning Approaches for Weather Prediction in IoT Systems: A Case Study, Sensors, 2025. doi:10.3390/s25134070.
  3. Michailidis, I., et al., An Arduino-Based, Portable Weather Monitoring System, Electronics, 2025. doi:10.3390/electronics14122330.
  4. Albuali, A., et al., Scalable Lightweight IoT-Based Smart Weather Measurement System, Sensors, 2023. doi:10.3390/s23125569.
  5. Costa Branco, P. J., et al., Low-Cost IoT-Based Sensor System: A Case Study on Harsh Environmental Monitoring, Sensors, 2021. doi:10.3390/s21010214.
  6. Johnston, S. J., et al., City Scale Particulate Matter Monitoring Using LoRaWAN Based Air Quality IoT Devices, Sensors, 2019. doi:10.3390/s19010209.
  7. Moursi, A. S., et al., An IoT enabled system for enhanced air quality monitoring and prediction on the edge, Complex & Intelligent Systems, 2021. doi:10.1007/s40747-021-00476-w.
  8. Nagarajaiah, H., et al., Low-Cost IoT Weather Prediction and Monitoring – A Review, Processes, 2024. doi:10.3390/pr12091961.
9. Suparta, W., Warsita, A., Ircham, A low-cost development of automatic weather station based on Arduino…, IJEECS, 2021. doi:10.11591/ijeecs.v24.i2.pp744-753.

10. Tagliabue, L. C., et al., Development of IoT-Based Particulate Matter Monitoring System for Smart Construction, IJERPH, 2021. doi:10.3390/ijerph182111510.

11. Kalia, P., Ansari, M. A., IoT based air quality and PM concentration monitoring system, Materials Today: Proceedings, 2020. doi:10.1016/j.matpr.2020.02.179.

  12. Ioannou, K., et al., Low-Cost Automatic Weather Stations in the Internet of Things, Information, 2021. doi:10.3390/info12040146.