[Iplant-api-dev] loading iRODS data into the browser
Cornel Ghiban
ghiban at cshl.edu
Wed Aug 29 09:03:40 MST 2012
Hi Matt,
Here's the set of requests IGV makes:
"GET /project/ngs/tools/stream_data/20/175.bam.bai HTTP/1.1" 200 279000
"HEAD /project/ngs/tools/stream_data/20/175.bam HTTP/1.1" 200 -
"GET /project/ngs/tools/stream_data/20/175.bam HTTP/1.1" 200 1047552
In this last GET request, IGV sends a Range header
['Range' => 'bytes=0-1023999'] to fetch only the first MB (1,024,000 bytes)
of the file. My "stream_data" tool ignores the header and sends the whole
file.
Then, as you zoom in, there's another request with ['Range' =>
'bytes=519-1024518'].
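For reference, honoring such a request means parsing the Range header and
replying with 206 Partial Content plus a matching Content-Range. Below is a
minimal Python sketch of that logic. It is not the actual stream_data tool:
the names parse_range and respond are made up for illustration, and it only
handles a single byte range on data already held in memory.

```python
import re

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def parse_range(header, size):
    """Return an inclusive (start, end) pair for a single-range header.

    Returns None when the header is missing, malformed, multi-range, or
    cannot be satisfied. (Simplification: a real server would answer 416
    for an unsatisfiable range instead of falling back to the full file.)
    """
    if size <= 0 or not header:
        return None
    m = _RANGE_RE.match(header.strip())
    if not m:
        return None
    first, last = m.group(1), m.group(2)
    if first == "" and last == "":
        return None
    if first == "":
        # Suffix form "bytes=-N": the last N bytes of the file.
        n = int(last)
        if n == 0:
            return None
        return max(size - n, 0), size - 1
    start = int(first)
    end = int(last) if last else size - 1
    if start >= size or start > end:
        return None
    # Clamp the end to the last byte that actually exists.
    return start, min(end, size - 1)


def respond(data, range_header=None):
    """Return (status, headers, body) for an in-memory file `data`."""
    headers = {"Accept-Ranges": "bytes"}
    rng = parse_range(range_header, len(data))
    if rng is None:
        # No usable Range header: send everything with a plain 200.
        headers["Content-Length"] = str(len(data))
        return 200, headers, data
    start, end = rng
    body = data[start:end + 1]  # end is inclusive
    headers["Content-Length"] = str(len(body))
    headers["Content-Range"] = "bytes %d-%d/%d" % (start, end, len(data))
    return 206, headers, body
```

For a 2436131-byte file, respond(data, 'bytes=0-1023999') gives status 206,
Content-Length 1024000 and Content-Range 'bytes 0-1023999/2436131'.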
Moving on to the iPlant server, here's the response to a GET request for a
file shared with user "world":
#-----------------------------------------------------------------
[cornel at greenvm1 tmp ]$ wget -S --no-check-certificate \
    --header 'Range: bytes=0-1023999' \
    https://foundation.iplantc.org/io-v1/io/download/ghiban/archive/jobs/job-3601-nth/tophat_out/accepted_hits_sorted.bam
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Date: Wed, 29 Aug 2012 15:44:32 GMT
Server: Noelios-Restlet-Engine/1.2.m1
Vary: Accept-Charset,Accept-Encoding,Accept-Language,Accept
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
Length: unspecified [application/octet-stream]
Saving to: `accepted_hits_sorted.bam'
2012-08-29 07:43:06 (1.09 MB/s) - `accepted_hits_sorted.bam' saved [2760854]
#-----------------------------------------------------------------
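It looks like the IO service ignores the Range header: it advertises
"Accept-Ranges: bytes", but answers 200 OK and sends the whole file
(2760854 bytes) instead of a 206 with the requested 1024000 bytes.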
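The same conclusion can be reached mechanically from the status line and
headers. The Python sketch below is only an illustration:
parse_response_head and diagnose are hypothetical helper names, not part of
wget or the iPlant API. It parses a header dump like the one above and
reports whether the range was honored.

```python
# Header dump copied from the wget -S output above.
IPLANT_HEAD = """
HTTP/1.1 200 OK
Date: Wed, 29 Aug 2012 15:44:32 GMT
Server: Noelios-Restlet-Engine/1.2.m1
Vary: Accept-Charset,Accept-Encoding,Accept-Language,Accept
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Content-Type: application/octet-stream
Connection: close
"""


def parse_response_head(text):
    """Split a raw header dump into (status_code, {lowercased-name: value})."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    status = int(lines[0].split()[1])
    headers = {}
    for ln in lines[1:]:
        # Split on the first colon only, so values such as dates stay intact.
        name, _, value = ln.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers


def diagnose(status, headers):
    """Classify how a server reacted to a request that carried a Range header."""
    advertises = headers.get("accept-ranges", "").lower() == "bytes"
    honored = status == 206 and "content-range" in headers
    if honored:
        return "range honored"
    if advertises:
        return "advertises ranges but sent full response"
    return "no range support"


if __name__ == "__main__":
    print(diagnose(*parse_response_head(IPLANT_HEAD)))
```

A response with status 206 and a Content-Range header would be reported as
"range honored" instead.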
Here's what the response headers look like for a GET request to a static file:
#-----------------------------------------------------------------
$ wget -S --header 'Range: bytes=0-1023999' \
    http://pipeline-dev.dnalc.org/files/175.bam
HTTP/1.1 206 Partial Content
Date: Wed, 29 Aug 2012 11:49:45 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 28 Aug 2012 11:46:45 GMT
Accept-Ranges: bytes
Content-Length: 1024000
Content-Range: bytes 0-1023999/2436131
Connection: close
Content-Type: text/plain; charset=UTF-8
#-----------------------------------------------------------------
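The numbers in that 206 response are consistent with each other:
Content-Range is inclusive, so bytes 0-1023999 is 1023999 - 0 + 1 = 1024000
bytes, which matches Content-Length. A small Python sketch of that check,
with parse_content_range and check_partial as illustrative names only:

```python
import re


def parse_content_range(value):
    """Parse 'bytes START-END/TOTAL' into (start, end, total).

    total is None when the server reports it as '*' (unknown).
    """
    m = re.match(r"^bytes (\d+)-(\d+)/(\d+|\*)$", value.strip())
    if not m:
        raise ValueError("unsupported Content-Range: %r" % value)
    start, end = int(m.group(1)), int(m.group(2))
    total = None if m.group(3) == "*" else int(m.group(3))
    return start, end, total


def check_partial(headers):
    """Return True if Content-Length agrees with the inclusive Content-Range."""
    start, end, total = parse_content_range(headers["Content-Range"])
    expected = end - start + 1  # both ends are inclusive
    length_ok = int(headers["Content-Length"]) == expected
    bounds_ok = start <= end and (total is None or end < total)
    return length_ok and bounds_ok
```

Called with the Content-Length and Content-Range values shown above,
check_partial returns True.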
Thanks,
Cornel
On 8/29/2012 11:30 AM, Matthew Vaughn wrote:
> This is pretty interesting actually. I just grabbed the header from the IO service, returning that BAM file I linked earlier...
>
> HTTP/1.1 200 OK
> Date: Wed, 29 Aug 2012 15:22:48 GMT
> Server: Noelios-Restlet-Engine/1.2.m1
> Vary: Accept-Charset,Accept-Encoding,Accept-Language,Accept
> Accept-Ranges: bytes
> Access-Control-Allow-Origin: *
> Content-Type: application/octet-stream
> Connection: close
>
> It should actually be allowing and handling byte range requests! In fact, IGV is not complaining about errors until it tries to do a seek on the BAM file to display reads.
>
> On Aug 29, 2012, at 10:00 AM, Cornel Ghiban wrote:
>
>> Hi Matt,
>>
>> Thanks for letting us know about user "world" (I was sharing the files with a "dnasubway" user and then having this user stream the files directly to the browser).
>> But it still doesn't work, since IGV insists on using byte-range requests.
>>
>> Thanks,
>> Cornel
>>
>> On 8/28/2012 6:05 PM, Matthew Vaughn wrote:
>>> This is very close, as it presents a publicly available HTTP link for the BAM file, but it doesn't support byte-range fetch.
>>>
>>> # Update to your own path, credentials, etc
>>>
>>> # Make the BAM and BAI file readable to user 'world'
>>> curl -X POST -sku "vaughn" -d "username=world" -d "canRead=true" -d "canWrite=false" https://foundation.iplantcollaborative.org/io-v1/io/share/vaughn/testbam/bwa_s_6_BC12_sorted.bam
>>>
>>> curl -X POST -sku "vaughn" -d "username=world" -d "canRead=true" -d "canWrite=false" https://foundation.iplantcollaborative.org/io-v1/io/share/vaughn/testbam/bwa_s_6_BC12_sorted.bam.bai
>>>
>>> # Access un-authenticated via the /download endpoint
>>> curl -X GET -sk https://foundation.iplantc.org/io-v1/io/download/vaughn/testbam/bwa_s_6_BC12_sorted.bam
>>>
>>> On Aug 28, 2012, at 4:20 PM, Cornel Ghiban wrote:
>>>
>>>> Hi Roger,
>>>>
>>>> Well, I was thinking to save some time (and disk space) and stream the bam
>>>> files from iRODS to IGV via a perl script, but it looks like IGV requests
>>>> use Range headers and it wants slices of the BAM file. I don't think my perl
>>>> script will be able to handle this :)
>>>>
>>>> Thanks,
>>>> Cornel
>>>>
>>>> On 8/28/2012 5:15 PM, Barthelson, Roger A - (rogerab) wrote:
>>>>> Hi Cornel-
>>>>>
>>>>> I assume you mean that you have a running instance of IGV, and you want to
>>>>> move BAM files into it. Normally you would want to use icommands or
>>>>> similar to move the files into place before you start, but if that is not
>>>>> possible for what you are doing, then it seems like maybe you would need
>>>>> to have a separate thread run to move the files into place. I doubt there
>>>>> is a way of having, for example, IGV address iRods directly. Maybe there
>>>>> is somewhere, but I haven't heard of it. So, I think you have to move the
>>>>> files into place with a different thread and then open them from IGV. I'm
>>>>> not sure how that would work. I found this mysterious perl script
>>>>> recently, maybe it could help.
>>>>>
>>>>> Roger
>>>>>
>>>>>
>>>>> Roger Barthelson Ph.D.
>>>>> Bioinformatics Analyst
>>>>> iPlant Collaborative
>>>>> BIO5 Institute, University of Arizona
>>>>> Phone: 520-977-5249
>>>>> Email: rogerab at email.arizona.edu
>>>>> Web: http://www.iplantcollaborative.org/
>>>>> http://bio-it.arizona.edu/Roger_Barthelson_Home.html
>>>>>
>>>>> On 8/28/12 1:42 PM, "Cornel Ghiban" <ghiban at cshl.edu> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there an easy way of sending some data to the user?
>>>>>> What I'd like to do is send some BAM files into IGV.
>>>>>>
>>>>>> ( I've tried using a proxy web-script, but for some reason IGV fails to
>>>>>> load
>>>>>> the data. If I save the data to disk and then manually load it into IGV
>>>>>> it
>>>>>> works fine. )
>>>>>>
>>>>>> Thanks,
>>>>>> Cornel
>>>>>> _______________________________________________
>>>>>> Iplant-api-dev mailing list
>>>>>> Iplant-api-dev at iplantcollaborative.org
>>>>>> http://mail.iplantcollaborative.org/mailman/listinfo/iplant-api-dev
>>>>>
>>>
>>> --
>>> Matthew W. Vaughn, Ph.D.,
>>> Manager, Life Sciences Computing Group
>>> Texas Advanced Computing Center
>>> Austin, TX
>>> vaughn at tacc.utexas.edu | (949) 436-6642
>>>
>>>