Time-stamps and R

I recently found out that dealing with time-stamps in R can be a real pain. On the surface everything works fine: we can call Sys.time(), take the resulting object (which has the classes “POSIXct” and “POSIXt”), and transform it into a numeric (i.e. the number of seconds since midnight UTC on 01/01/1970) for arithmetic manipulation.

> Sys.time()
[1] "2011-09-03 18:18:03 EDT"
> class(Sys.time())
[1] "POSIXct" "POSIXt"
> as.numeric(Sys.time())
[1] 1315088290
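
The numeric form is what makes the arithmetic easy. Here is a minimal sketch of going back and forth, using the epoch value printed above; the variable names and the choice of UTC as the display time zone are assumptions made for illustration.

```r
# Epoch seconds taken from the output above. tz = "UTC" is an assumption,
# chosen so the result does not depend on the local time zone.
secs <- 1315088290
t0 <- as.POSIXct(secs, origin = "1970-01-01", tz = "UTC")

# Adding a number to a POSIXct shifts it by that many seconds.
t1 <- t0 + 300

# Subtracting two POSIXct values gives a difftime; as.numeric() with
# explicit units turns it back into a plain number.
elapsed <- as.numeric(difftime(t1, t0, units = "secs"))

print(format(t0, "%Y-%m-%d %H:%M:%S"))
print(elapsed)
```

Asking difftime() for explicit units avoids surprises, since the default unit is picked automatically based on the size of the difference.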

Now all seems fine unless, say, your time-stamp is in some other format. Suppose it looks like “12/15/10 0:00.” From what I have seen, R doesn’t deal well with formats such as this one out of the box. Besides parsing the string by hand to coerce it into something R likes, what has worked for me is the following. You can break the time-stamp above into two pieces, the date and the time. The time R handles OK; the date is the problem. Luckily, for this example I didn’t have to do the splitting myself, as the csv file contained separate columns for date and time in addition to the time-stamp. So we can transform the date into a format R likes via

as.Date(gsendata$Date[1:10], "%m/%d/%Y")
[1] "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15" "0010-12-15"

OK, so far so good. Then we can take the time column on its own and paste the two strings together.

paste(as.Date(gsendata$Date, "%m/%d/%Y") , gsendata$Time)[1:10]
 [1,] "0010-12-15 0:04:00"
 [2,] "0010-12-15 0:09:00"
 [3,] "0010-12-15 0:14:00"
 [4,] "0010-12-15 0:19:00"
 [5,] "0010-12-15 0:24:00"
 [6,] "0010-12-15 0:29:00"
 [7,] "0010-12-15 0:34:00"
 [8,] "0010-12-15 0:39:00"
 [9,] "0010-12-15 0:44:00"
[10,] "0010-12-15 0:49:00"

These are still character strings, so R won’t convert them to numeric format directly; we first have to parse them with strptime().

as.numeric(strptime(paste(as.Date(gsendata$Date, "%m/%d/%Y") , gsendata$Time), "%Y-%m-%d %H:%M:%S"))[1:10]
 [1,] -61821514560
 [2,] -61821514260
 [3,] -61821513960
 [4,] -61821513660
 [5,] -61821513360
 [6,] -61821513060
 [7,] -61821512760
 [8,] -61821512460
 [9,] -61821512160
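
To see why the parsing step is needed, here is a small sketch with a single made-up string in the same "date time" layout. The string and the UTC time zone are assumptions for illustration and are not taken from the data above.

```r
stamp <- "2011-09-03 18:18:03"   # hypothetical example string

# A character string like this is not a number, so as.numeric() gives NA
# (and a warning, suppressed here).
direct <- suppressWarnings(as.numeric(stamp))

# strptime() parses the string into a POSIXlt date-time.
parsed <- strptime(stamp, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Converting to POSIXct first makes the seconds-since-1970 representation
# explicit before taking the numeric value.
secs <- as.numeric(as.POSIXct(parsed))

print(is.na(direct))
print(class(parsed))
print(secs)
```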

You have probably noticed that the numeric values are negative. That is because the "%Y" format expects a four-digit year, so as.Date() has read “10” as the year 0010. If you care about the actual date/time and not just the time elapsed, you could add the numeric value of 2000 years to the entire vector.
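
Parsing the two-digit year with "%y" rather than "%Y" should avoid the offset altogether. Here is a minimal sketch of the same date-plus-time pipeline, with two made-up rows standing in for the real Date and Time columns; the vectors `dates` and `times` and the UTC time zone are assumptions.

```r
dates <- c("12/15/10", "12/15/10")   # hypothetical stand-in for the Date column
times <- c("0:04:00", "0:09:00")     # hypothetical stand-in for the Time column

# "%y" reads a two-digit year, so "10" becomes 2010 rather than 0010.
d <- as.Date(dates, "%m/%d/%y")

# Paste date and time together, then parse the combined string.
stamps <- as.POSIXct(paste(d, times), format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

print(format(d))
print(as.numeric(stamps))
```

With the year parsed correctly the numeric values are positive, and consecutive rows still differ by 300 seconds, as in the output above.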

There is probably (hopefully) a better way to deal with time formats that R does not like, other than going to Python and using really nice packages like datetime, so if you know of one, post a comment.


2 thoughts on “Time-stamps and R”

  1. Boris says:

    Try as.POSIXct: zu <- as.POSIXct("12/15/10 0:00", format="%m/%d/%y %H:%M")

  2. notjustmath says:

    Ahh, nice Boris. thanks!
