"A big computer, a complex algorithm and a long time does not equal science." -- Robert Gentleman
Monday, 15 June 2009
Replacing 0 with NA - an evergreen from the list
This thread from the R-help list describes an evergreen tip that proves useful at least once in R practice.
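The thread itself is not reproduced here, so the following is only a minimal sketch of one common way to do the replacement, not necessarily the approach discussed on the list. The data frame and its column names are made up for illustration.

```r
# Hypothetical example data; the column names are invented for illustration.
myData <- data.frame(a = c(1, 0, 3), b = c(0, 5, 6))

# myData == 0 is a logical matrix with TRUE wherever a cell equals 0.
# Using it as an index assigns NA to exactly those cells.
myData[myData == 0] <- NA
```

This only makes sense when 0 really is a placeholder for "no measurement"; genuine zeros would be lost as well.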
Sunday, 8 March 2009
Dealing with missing values
Two new quick tips from 'almost regular' contributor Jason:
Handling missing values in R can be tricky. Let's say you have a table
with missing values you'd like to read from disk. Reading in the table
with,
read.table( fileName )
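might fail, because with its default separator read.table splits fields
on any run of whitespace, so an empty field disappears and the row comes
up short. If your table is properly formatted, then R can determine
what's a missing value by using the "sep" option in read.table: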
read.table( fileName, sep="\t" )
This tells R that all my columns will be separated by TABS regardless of
whether there's data there or not. So, make sure that your file on disk
really is fully TAB separated: if there is a missing data point you must
have a TAB to tell R that this datum is missing and to move to the next
field for processing.
Lastly, don't forget the "header=TRUE" option if you have a header line
in your file.
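To see the difference in practice, here is a small sketch that writes a made-up TAB-separated file to a temporary location and reads it both ways. The file contents and the column names (id, weight, height) are assumptions for illustration only.

```r
# Write a small, made-up TAB-separated file to a temporary location.
fileName <- tempfile(fileext = ".txt")
writeLines(c("id\tweight\theight",
             "1\t70\t180",
             "2\t\t175",   # weight is missing: two consecutive TABs
             "3\t65\t"),   # height is missing: trailing TAB
           fileName)

# Default separator: runs of whitespace collapse, so rows 2 and 3 appear
# to have only two fields and read.table() stops with an error.
defaultRead <- try(read.table(fileName, header = TRUE), silent = TRUE)

# sep = "\t": every TAB delimits a field, and the empty numeric fields
# are read as NA.
myData <- read.table(fileName, header = TRUE, sep = "\t")
```

One caveat: blank fields become NA only in logical, integer, numeric and complex columns. In a character column an empty field is read as an empty string unless you also pass something like na.strings = c("NA", "").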
Here's the 2nd tip:
Some algorithms in R don't support missing (NA) values. If you have a
data.frame with missing values and quickly want the ROWS with any
missing data to be removed then try:
myData[rowSums(is.na(myData))==0, ]
To find NA values in your data you have to use the "is.na" function,
because a comparison such as x == NA itself evaluates to NA rather than
TRUE or FALSE.
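Here is the row filter spelled out step by step on a made-up data frame (the column names are invented for illustration). The sketch also shows complete.cases(), a base R function that gives the same result here; it is offered as an alternative, not as part of Jason's tip.

```r
# Hypothetical data with two incomplete rows.
myData <- data.frame(id     = 1:4,
                     weight = c(70, NA, 65, 80),
                     height = c(180, 175, NA, 168))

# is.na() returns a logical matrix; rowSums() counts the NAs in each row.
naPerRow <- rowSums(is.na(myData))

# Keep only the rows whose NA count is zero.
cleanData <- myData[naPerRow == 0, ]

# Alternative from base R: complete.cases() flags rows without any NA.
sameRows <- myData[complete.cases(myData), ]

# Why is.na() is needed: comparing with == never yields TRUE for an NA.
comparison <- c(1, NA) == NA   # both elements are NA
detected   <- is.na(c(1, NA))  # FALSE TRUE
```

na.omit(myData) drops the same rows as well, but it also attaches an "na.action" attribute recording which rows were removed.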