Slide1
Streaming data at the IRIS DMC
Collecting, consolidating and re-distribution
Rick Benson
Slide2
The DMC’s Import of Streaming Real Time Data: Processes
80 slarchive processes retrieve data from SeedLink servers
20 orb2orb processes retrieve data from Antelope systems
30 ew2mseed processes retrieve data from Earthworm WaveServers
5 RRPServer processes receive data from Edge RRP feeds
An Earthworm system imports data from other Earthworm, Reftek, and Guralp systems
A Nanometrics ApolloServer imports data from Nanometrics systems
Slide3
Real-Time Protocols Involved
Ways to Improve Configuration Correctness and Data Integrity:
Reduce the use of protocols that require detailed config:
  ew2mseed
  Nanometrics Apollo
  Reftek rtpd
Increase the use of protocols that deliver native miniSEED:
  Ringserver/slarchive
  NEIC rrp
Slide4
Real-Time Stations in this region:
http://www.iris.edu/gmap/_REALTIME?minlat=-10&maxlat=50&minlon=61&maxlon=150
Slide5
REAL TIME DATA STATUS
Buffer of Uniform Data (BUD)
There are 177 networks currently submitting data to the DMC, and 4 are Partially or Fully Restricted
There are really 4 "BUDS":
Main BUD with 118 networks
  2,233 stations, 16,496 channels
BUD Restricted (AF, EC, YN)
  65 stations, 235 channels
USRA with 9 networks (AZ, C, CI, MC, N4, PB, TA, US, UU)
  619 stations, 17,827 channels
  - TA alone has 55,655 channels
USRA Restricted (XO Flex Array)
  50 stations, 420 channels
2918 STATIONS with ~107,927 CHANNELS... and steadily increasing.
Slide6
Purpose of collecting streaming data
Efficiency and automation of the archiving process
Much easier than infrequent data delivery
Rapid data availability for users
Batch requests, event-windowed and streaming
Allows automated, near real-time quality control
Routine measurements in near real-time
Access to data not available otherwise
For some data contributors real-time is the preferred or only option
Slide7
Streaming data flow to the DMC archive
[Diagram: DCC, DCC, DCC, DCC, ... -> BUD (Buffer of Uniform Data) -> BATS -> ARCHIVE]
BUD data holdings: seconds to 8 weeks behind current time*
BATS: 12-24 hours after arrival
ARCHIVE data holdings: 24 hours to 40+ years
Slide8
Buffer of Uniform Data, the ins and outs
[Diagram: data sources -> BUD (Buffer of Uniform Data) -> export services]
In: Earthworm, SeedLink, RRP, Reftek, Guralp, Antelope, CD 1.x, Mini-SEED, Nanometrics
Out:
  Wilber3: www.iris.edu/wilber3/
  DMC Web Services: service.iris.edu
  SeedLink Export: www.iris.edu/data/dmc-seedlink.htm
Slide9
So that's how data gets INTO the DMC.
Now, how do we EXPORT data in real-time?
Slide10
The DMC’s SeedLink export service
SeedLink server, publicly accessible
Host: rtserve.iris.washington.edu
Port: 18000
All open data in the BUD is available via SeedLink with minimal added latency.
Median latency: 40 Hz => 12 seconds, 1 Hz => 160 seconds
Slide11
The SeedLink protocol can be summarized as a simple, ASCII-based, data selection phase followed by the streaming of data packets from the server. SeedLink packets are composed of a small header followed by a 512-byte Mini-SEED record (data only SEED). The negotiation phase allows the client to request only specified data from the server for each selected data stream. A data stream is defined by a network and station code pair.
By utilizing sequence numbers for each packet in a data stream, the SeedLink protocol allows for connections to be resumed, eliminating most data gaps. The ability to resume data streams is primarily dependent on how much data, time-wise, the remote SeedLink server has in its buffer.
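The packet layout and resumption idea described above can be sketched in a few lines of Python. This is a minimal illustration, not the libslink implementation: the 8-byte header layout ("SL" followed by six hexadecimal digits), the Mini-SEED byte offsets for station and network, and the convention of resuming at the last sequence number plus one are assumptions taken from common SeedLink 3 and Mini-SEED 2 usage rather than from these slides. The function names are hypothetical.

```python
SL_HEADER_SIZE = 8        # assumed layout: b"SL" + 6 hex digits
MSEED_RECORD_SIZE = 512   # record size stated in the text


def parse_seedlink_packet(packet: bytes) -> dict:
    """Split one data packet into sequence number, stream ID and record."""
    if len(packet) != SL_HEADER_SIZE + MSEED_RECORD_SIZE:
        raise ValueError("expected a 520-byte SeedLink packet")
    header, record = packet[:SL_HEADER_SIZE], packet[SL_HEADER_SIZE:]
    if not header.startswith(b"SL"):
        raise ValueError("missing SL signature")
    if header.startswith(b"SLINFO"):
        raise ValueError("INFO packet, not a data packet")
    sequence = int(header[2:].decode("ascii"), 16)
    # Assumed Mini-SEED fixed header offsets:
    # station code in bytes 8-12, network code in bytes 18-19.
    station = record[8:13].decode("ascii").strip()
    network = record[18:20].decode("ascii").strip()
    return {"sequence": sequence, "stream": (network, station), "record": record}


def resume_sequence(last_sequence: int) -> int:
    """Sequence number to request on reconnect; wraps at 24 bits."""
    return (last_sequence + 1) & 0xFFFFFF


if __name__ == "__main__":
    # Synthetic packet for illustration only (not real waveform data).
    fake_record = (b"000001D " + b"KONO " + b"00" + b"BHZ" + b"IU").ljust(512, b"\x00")
    parsed = parse_seedlink_packet(b"SL00001A" + fake_record)
    print(parsed["sequence"], parsed["stream"], resume_sequence(parsed["sequence"]))
```

A client that remembers the last sequence number per stream can pass the result of `resume_sequence` back to the server when it reconnects, provided the server still holds that packet in its buffer.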
Special, out-of-band packets created by a SeedLink server and recognized by libslink are used to communicate server details to clients and to implement keep-alive packet swapping. These special INFO packets are XML-formatted data embedded in Mini-SEED comment records.
The protocol allows for two different modes of data transmission: uni-station and multi-station. Uni-station mode operates by transmitting a single data stream (data from a single station) through one network connection. In this mode the data stream does not need to be specified by the client, as it is implied by the internet address and port. Multi-station mode operates by transmitting multiplexed data streams (data from multiple stations) through a single network connection. Almost all connections are negotiated as multi-station, even if only a single station is requested; uni-station mode is deprecated for most publicly accessible servers.
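As a rough illustration of the multi-station negotiation phase, the sketch below builds the ASCII command lines a client might send before data starts flowing. The command names (STATION, SELECT, DATA, END), the carriage-return/line-feed terminator, and the hexadecimal sequence argument to DATA are assumptions based on the SeedLink 3 protocol as commonly documented; they do not appear in these slides. `build_negotiation` is a hypothetical helper, and no network connection is made.

```python
def build_negotiation(streams, last_seen=None):
    """Return the command lines for a multi-station SeedLink handshake.

    streams:   list of (network, station, [selectors]) tuples
    last_seen: optional {(network, station): last_sequence} for resuming
    """
    last_seen = last_seen or {}
    lines = []
    for network, station, selectors in streams:
        # One STATION block per data stream (network and station code pair).
        lines.append(f"STATION {station} {network}")
        for selector in selectors:
            lines.append(f"SELECT {selector}")
        if (network, station) in last_seen:
            # Resume from the packet after the last one received (24-bit wrap).
            next_seq = (last_seen[(network, station)] + 1) & 0xFFFFFF
            lines.append(f"DATA {next_seq:06X}")
        else:
            lines.append("DATA")
    # END closes the selection phase; the server then starts streaming.
    lines.append("END")
    return [line.encode("ascii") + b"\r\n" for line in lines]


if __name__ == "__main__":
    commands = build_negotiation(
        [("IU", "KONO", ["BHE.D", "BHN.D"]), ("GE", "WLF", [])],
        last_seen={("IU", "KONO"): 0x1A},
    )
    for command in commands:
        print(command)
```

Even a single-station request goes through the same STATION block, which matches the observation that almost all connections are negotiated as multi-station.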
SeedLink was originally created as the transport layer for the SeisComP package developed by GEOFON.
Protocol Details
Slide12
IRIS Supported SeedLink clients:
http://www.iris.edu/data/dmc-seedlink.htm
Slide13
SeedLink export data flow
[Diagram: BUD (512-byte Mini-SEED) -> Scanner, Scanner, Scanner, Scanner, ... -> Ringserver (~8 hours of data) -> SeisComP, slarchive, slink2orb, slink2ew, NAQS, ...]
2,918 stations
33,000+ channels
Slide14
Snapshot of SL client connections: Q1-2014
An average of 600-700 connections active at all times
Over 65 gigabytes per day streamed out
Slide15
The IRIS Turnkey SeedLink Server:
Called "Ringserver"
Slide16
Ringserver: a stand-alone SeedLink server
The IRIS DMC’s SeedLink implementation is being released to promote data exchange. Any data center creating 512-byte Mini-SEED records can have a SeedLink server.
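Since the only stated prerequisite is a supply of 512-byte Mini-SEED records, a data center might first confirm that its files have that shape. The sketch below is a heuristic check only: it assumes each record begins with a six-digit ASCII sequence number followed by a data quality indicator (D, R, Q or M), as in the Mini-SEED 2 fixed header, and it does not decode the record or read the record length from the record itself. `looks_like_512_byte_miniseed` is a hypothetical helper name.

```python
import re

RECORD_SIZE = 512
# Assumed start of a Mini-SEED 2 fixed header: 6 ASCII digits + quality code.
_FIXED_HEADER_START = re.compile(rb"\d{6}[DRQM]")


def looks_like_512_byte_miniseed(data: bytes) -> bool:
    """Heuristically check that data is a whole number of 512-byte records."""
    if not data or len(data) % RECORD_SIZE != 0:
        return False
    return all(
        _FIXED_HEADER_START.match(data, offset) is not None
        for offset in range(0, len(data), RECORD_SIZE)
    )


if __name__ == "__main__":
    # Synthetic records for illustration only.
    record = b"000001D ".ljust(RECORD_SIZE, b"\x00")
    print(looks_like_512_byte_miniseed(record * 3))
    print(looks_like_512_byte_miniseed(record + b"extra"))
```

A file that fails this check may still be valid Mini-SEED with a different record length, in which case it would need repacking before it fits the 512-byte requirement stated above.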
Source code: https://seiscode.iris.washington.edu/projects/ringserver
SeedLink server configuration instructions: https://seiscode.iris.washington.edu/projects/ringserver/wiki/How_to_configure_ringserver_as_a_SeedLink_streaming_server
Slide17
Highlights of ringserver implementation
Stateful and self-correcting operation
Extremely scalable threaded architecture
Ring buffer is First In First Out (FIFO)
Buffer size & number of clients limited only by hardware
Dynamic configuration file options
Control access by IP address, limit access at stream level
SeedLink 3.1 plus Network and Station wildcarding
Comprehensive transfer logging
Runs on Linux, Mac OSX and Solaris (32 and 64-bit)
Integrated MiniSEED file system scanner
Slide18
Ringserver related software
dalitool – General purpose ringserver query
  Report server ID, version, general status, transfer rate
  List connected clients and associated stats
  Data stream inspection
dali2liss – Create LISS servers from ringserver streams
slink2dali – Send SeedLink streams to ringserver (beta)
mseedscan2dali – Alternate Mini-SEED scanner
Slide19
Ringserver conclusions
Scales to huge numbers of stations and channels
Has run uninterrupted at the IRIS DMC for many months
Dynamically reconfigurable to limit downtime
Tracks and logs comprehensive user statistics
Ringserver was created after attempts at using other systems
Slide20
Submitting Non-Real Time Data to IRIS
miniseed2dmc: A dedicated program for submitting batches of Mini-SEED data directly to the IRIS DMC.
Robustly transmits data to the DMC, tolerant of network disconnects and transmission restarts
Designed to transmit very large data sets
Reports transmission summary
Dataless SEED metadata must be supplied separately
Distributed as source code, a C compiler is required
Coordination with the IRIS DMC is required before submitting data.
Slide21
Sources of Data: Specific Solutions
Slide22
IRIS DMC Earthworm: Data to Archive
As part of our Earthworm system, the IRIS DMC runs an ew2ringserver=>ringserver=>SeedLink server and an slarchive process internally to write data imported by our Earthworm into our standard miniSEED archive.
We suggest that, where possible, network operators running Earthworm also run these processes to best serve data to the DMC.
Slide23
Network Operators Can Run An Integrated Earthworm to SeedLink Server using 'Ringserver'
Serves buffered Earthworm data to the DMC and any other data clients via the widely used standard SeedLink protocol (TCP, miniSEED)
Can be controlled with Earthworm startstop and writes log files to the Earthworm log file area
Convenient way to get miniSEED data out of Earthworm for local use too
Slide24
It works like this:
ew2ringserver->ringserver->SL clients
ew2ringserver is available in Earthworm v7.7 and is a standard Earthworm module. ew2ringserver exports data to a ringserver.
ringserver is a stand-alone C program that runs on Unix/Linux* and serves data via the SeedLink protocol. Standard SL (SeedLink) clients can access ringserver data. ringserver has features that allow it to integrate easily with Earthworm.
Detailed instructions to run ew2ringserver->ringserver are available at:
https://seiscode.iris.washington.edu/projects/ew2ringserver/wiki/A_SeedLink_server_for_Earthworm
*If you are running Earthworm on Windows but have access to a Unix system, you can still run ew2ringserver to export data to a ringserver running on the Unix system. However, the ringserver will be running outside Earthworm startstop control, so it is less well integrated.
Slide25
To Run a Ringserver…
Download, compile, and place the resulting executable ringserver in the Earthworm user's PATH. Get ringserver here: https://seiscode.iris.washington.edu/projects/ringserver
Prior to running ringserver, you will need to create a directory that will be used by ringserver as the buffer area for data
If needed, open a hole in your firewall to allow data clients to reach the SeedLink port (defined in the ringserver.d file, typically 18000)
Slide26
Ew2ringserver
Add an ew2ringserver instance to your earthworm system in the standard earthworm module way.
In the ew2ringserver.d file, the ringserver's IP address and data input port (typically port 16000) are specified by this parameter:
RSAddress localhost:16000
Slide27
To Integrate Ringserver with Earthworm
You Need to:
Put a ringserver.d file in your Earthworm params dir like:
---
RingDirectory /data/my_ringserver_data
DataLinkPort 16000
SeedLinkPort 18000
ServerID "MySeedlinkServer"
---
Put an entry in the startstop.d file like:
…
Process "ringserver /my_ew_run_dir/params/ringserver.d -STDERR"
Class/Priority TS 0
…
Slide28
Useful Auxiliary Programs to Monitor a Ringserver SeedLink Server
slinktool – use to confirm that a SeedLink server is up, that you can connect to it, and that it is serving data
dalitool – use to view clients connected to a ringserver and more
slarchive – use to download data from a SeedLink server
Slide29
Use dalitool to monitor a ringserver
Specific to ringserver, not for generic SeedLink servers
dalitool can be used to inspect currently connected clients of a ringserver
Source code for dalitool can be downloaded from: http://www.iris.edu/pub/programs/ringserver/
Slide30
View clients of a ringserver using dalitool:
dalitool -ff -C your.ringserver.ip.address:16000
…
ops1.iris.washington.edu [192.168.166.67:56328]
[SeedLink] SeedLink Client 2014-05-21 20:58:22.983494
Packet 1598812 (2014-05-21 20:45:39.069500) Lag 0%, 1.8 seconds
TX 185 packets 0.0 packets/sec 94720 bytes 0.0 bytes/sec
RX 0 packets 0.0 packets/sec 0 bytes 0.0 bytes/sec
Stream count: 46
Match: ^IU_.*_.*$
Reject:
…
The above example connection displayed by dalitool is from running the command 'slinktool -p -S "IU_*" rtserve:18000'
Slide31
Use slinktool to query any SeedLink server
slinktool -h – for help/usage
slinktool -Q localhost:18000 – for a list of streams and time spans of data currently available on the SeedLink server running on the local machine, port 18000
slinktool -L localhost:18000 – for a list of stations served by the SeedLink server running on the local machine, port 18000
Slide32
slarchive
slarchive can be used to retrieve data from a SeedLink server, write it to disk, specify a "stream selection", etc.
-S streams Define a stream list for multi-station mode
'stream' is in NET_STA format, for example:
-S "IU_KONO:BHE BHN,GE_WLF,MN_AQU:HH?.D"
slarchive -h – for help/usage
Download slarchive from here: http://www.iris.edu/dms/nodes/dmc/software/downloads/slarchive/
Slide33
The DMC's Export of Streaming Real Time Data
We Use Ringserver as our SeedLink Server
Typically there are 650 or more connections to our SeedLink server at all times
Real time data is also available from Web services and Breq_fast
Real time data is used for some of the IRIS DMC's data products
Slide34
The DMC runs a publicly accessible SeedLink server on the following host and port:
host: rtserve.iris.washington.edu
port: 18000 (default SeedLink port)
All open data that the DMC receives in real-time is available via this SeedLink server.
Data arriving at the DMC more than 48 hours behind real-time are not exported via SeedLink.
Usage Restrictions: Users are welcome to any data available via the server as long as client actions do not inhibit our capability to deliver data to other users.
To view all currently available real-time stations:
http://www.iris.edu/mda/_REALTIME
http://www.iris.edu/gmap/_REALTIME
Currently 2918 stations (as of July 3, 2014) and more than 32,647 channels
Real-Time Data from IRIS DMC
Slide35
To improve communication about real time data issues, the DMC manages a real time mailing list. When problems with real time data flow are observed, messages will be posted to this list as the problems are discovered. Solutions to the problems or information regarding resumption of real time data feeds will also be posted to this list.
You can subscribe to this list at http://www.iris.washington.edu/mailman/listinfo/iris-rtfeeds
Real-Time mailing list