Improve error experience when LoadCsv throws · Issue #5656 · dotnet/machinelearning · GitHub

Improve error experience when LoadCsv throws #5656
Open
@eerhardt

Description

If you have a .csv file that has an error in the data, for example:

Date | Daily minimum temperatures
1/1/1981 | 20.7
1/2/1981 | 17.9
1/3/1981 | 18.8
1/4/1981 | 14.6
1/5/1981 | 15.8
...
7/20/1982 | ?0.2
7/21/1982 | ?0.8

Loading it throws an exception:

System.FormatException: Input string was not in a correct format.
   at System.Number.ThrowOverflowOrFormatException(ParsingStatus status, TypeCode type)
   at System.String.System.IConvertible.ToSingle(IFormatProvider provider)
   at System.Convert.ChangeType(Object value, Type conversionType, IFormatProvider provider)
   at Microsoft.Data.Analysis.DataFrame.Append(IEnumerable`1 row, Boolean inPlace)
   at Microsoft.Data.Analysis.DataFrame.LoadCsv(Stream csvStream, Char separator, Boolean header, String[] columnNames, Type[] dataTypes, Int64 numberOfRowsToRead, Int32 guessRows, Boolean addIndexColumn, Encoding encoding)

This isn't very helpful: it doesn't tell you where the problem is or what value caused it. We should log a more helpful error in this situation so users know what is wrong.
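To make the idea concrete, here is a minimal sketch of what a more helpful error could carry. `CsvCellParser` and `ParseSingle` are hypothetical names used only for illustration, not part of Microsoft.Data.Analysis, and the sketch assumes invariant-culture parsing and that the loader knows the row index and column name when it converts a cell.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of Microsoft.Data.Analysis.
public static class CsvCellParser
{
    // Parses one cell as a Single. On failure, throws a FormatException
    // that names the offending value, its row, and its column.
    public static float ParseSingle(string text, long rowIndex, string columnName)
    {
        // Assumption: values are parsed with the invariant culture.
        if (float.TryParse(text, NumberStyles.Float | NumberStyles.AllowThousands,
                           CultureInfo.InvariantCulture, out float value))
        {
            return value;
        }

        throw new FormatException(
            $"Could not convert '{text}' to Single at row {rowIndex}, column '{columnName}'.");
    }
}
```

With something along these lines, a value such as `?0.2` in the `Daily minimum temperatures` column would produce a message that quotes the bad value and points at its row and column, rather than only "Input string was not in a correct format."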

We should also consider having a mode where values like this turn into N/A values, such as Single.NaN.
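A lenient mode like that could be sketched roughly as follows. `LenientCsvCellParser` and `ParseSingleOrNaN` are hypothetical names, not an existing API, and the invariant culture is again an assumption.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of Microsoft.Data.Analysis.
public static class LenientCsvCellParser
{
    // Returns the parsed value, or Single.NaN when the cell is not a valid number.
    public static float ParseSingleOrNaN(string text)
    {
        // Assumption: values are parsed with the invariant culture.
        return float.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out float value)
            ? value
            : float.NaN;
    }
}
```

Under this sketch, `?0.8` would become `Single.NaN` while well-formed values such as `17.9` parse as usual. Whether such a mode should be opt-in is a design choice left open here.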
