jeudi 17 avril 2014

C# remplacement de champs de tableau d'un autre tableau - Stack Overflow


Hey guys just had a quick question for you. I have a small program that reads in a CSV file which contains a report delimited by commas. Within the report, one of the fields is a date, which I convert to a date/time, and only pull information that is from within a certain time frame. Here is my problem though: the report is actually a list of jobs that run on a certain system. Some of the job names however, contain commas on their own. This means that an excel output report is extremely messy because if a job name has a comma the job name will be divided between 2 cells. I am a bit new to programming so the only way I could think of solving this was checking to see how many fields ended up in my array delimited by commas. And if it was greater than the normal, I'd concatinate two of the fields which I know will be the job name. However, the issue is that if a job name contains 2 commas, this will not work because it is only set up to deal with 1 extra comma in the data.


I should add, the csv report that I read in is generated by another application where I can't control how it is delimited. Otherwise, I would change it to pipes or something of the sort.


Any thoughts? Below is the portion of code that deals with it:


StreamReader SR = new StreamReader(inputFile);
StreamWriter SW = new StreamWriter(outputFile);
string records;
//read headers from first line
string headers = records = SR.ReadLine();
SW.WriteLine(headers);
DateTime YesterdayAM = Convert.ToDateTime(DateTime.Now.AddDays(-1).ToShortDateString() + " 05:00:00 AM");
while ((records = SR.ReadLine()) != null)
{
if (records.Trim().Length > 0)
{
string daterecord = GetDateTimeFromStringArray(records);
if (daterecord.Length > 0)
{
DateTime recordDate = Convert.ToDateTime(daterecord);
if (recordDate >= YesterdayAM)
{
string[] checkfields = records.Split(',');
if (checkfields.Length > 13)
{
string[] replacefields = { checkfields[0], checkfields[1] + " " + checkfields[2], checkfields[3], checkfields[4], checkfields[5], checkfields[6], checkfields[7], checkfields[8], checkfields[9], checkfields[10], checkfields[11], checkfields[12] };
for (int i = 0; i < replacefields.Length; i++)
{
SW.Write(replacefields[i] + ",");
}
SW.Write(Environment.NewLine);
}
else
{
SW.WriteLine(records);
}
}
}
}
}



It's a bit hacky to do it this way, but if you can't fix the source and you know that extra commas will only appear in the one field you could do something like this:


            string[] checkfields = records.Split(',');
while (checkfields.Length > 13)
{
// concat [1] & [2] into a new array
checkfields = checkfields.Take(1)
.Concat(new string[] { string.Join("", checkfields.Skip(1).Take(2).ToArray()) })
.Concat(checkfields.Skip(3)).ToArray();
} // if it's still too long it will loop again

Or better yet:


            string[] checkfields = records.Split(',');
int extraFields = checkfields.Length - 13;
if (extraFields > 0)
{
// concat fields 1....1 + extraFields
checkfields = checkfields.Take(1)
.Concat(new string[] { string.Join("", checkfields.Skip(1).Take(extraFields).ToArray()) })
.Concat(checkfields.Skip(extraFields + 1)).ToArray();
} // avoids looping by doing it all in one go

Note: the linq statements are untested and might not be the absolute most efficient way to do it. Also all the "magic" numbers should probably be replaced with constants for maintainability.



Hey guys just had a quick question for you. I have a small program that reads in a CSV file which contains a report delimited by commas. Within the report, one of the fields is a date, which I convert to a date/time, and only pull information that is from within a certain time frame. Here is my problem though: the report is actually a list of jobs that run on a certain system. Some of the job names however, contain commas on their own. This means that an excel output report is extremely messy because if a job name has a comma the job name will be divided between 2 cells. I am a bit new to programming so the only way I could think of solving this was checking to see how many fields ended up in my array delimited by commas. And if it was greater than the normal, I'd concatinate two of the fields which I know will be the job name. However, the issue is that if a job name contains 2 commas, this will not work because it is only set up to deal with 1 extra comma in the data.


I should add, the csv report that I read in is generated by another application where I can't control how it is delimited. Otherwise, I would change it to pipes or something of the sort.


Any thoughts? Below is the portion of code that deals with it:


StreamReader SR = new StreamReader(inputFile);
StreamWriter SW = new StreamWriter(outputFile);
string records;
//read headers from first line
string headers = records = SR.ReadLine();
SW.WriteLine(headers);
DateTime YesterdayAM = Convert.ToDateTime(DateTime.Now.AddDays(-1).ToShortDateString() + " 05:00:00 AM");
while ((records = SR.ReadLine()) != null)
{
if (records.Trim().Length > 0)
{
string daterecord = GetDateTimeFromStringArray(records);
if (daterecord.Length > 0)
{
DateTime recordDate = Convert.ToDateTime(daterecord);
if (recordDate >= YesterdayAM)
{
string[] checkfields = records.Split(',');
if (checkfields.Length > 13)
{
string[] replacefields = { checkfields[0], checkfields[1] + " " + checkfields[2], checkfields[3], checkfields[4], checkfields[5], checkfields[6], checkfields[7], checkfields[8], checkfields[9], checkfields[10], checkfields[11], checkfields[12] };
for (int i = 0; i < replacefields.Length; i++)
{
SW.Write(replacefields[i] + ",");
}
SW.Write(Environment.NewLine);
}
else
{
SW.WriteLine(records);
}
}
}
}
}


It's a bit hacky to do it this way, but if you can't fix the source and you know that extra commas will only appear in the one field you could do something like this:


            string[] checkfields = records.Split(',');
while (checkfields.Length > 13)
{
// concat [1] & [2] into a new array
checkfields = checkfields.Take(1)
.Concat(new string[] { string.Join("", checkfields.Skip(1).Take(2).ToArray()) })
.Concat(checkfields.Skip(3)).ToArray();
} // if it's still too long it will loop again

Or better yet:


            string[] checkfields = records.Split(',');
int extraFields = checkfields.Length - 13;
if (extraFields > 0)
{
// concat fields 1....1 + extraFields
checkfields = checkfields.Take(1)
.Concat(new string[] { string.Join("", checkfields.Skip(1).Take(extraFields).ToArray()) })
.Concat(checkfields.Skip(extraFields + 1)).ToArray();
} // avoids looping by doing it all in one go

Note: the linq statements are untested and might not be the absolute most efficient way to do it. Also all the "magic" numbers should probably be replaced with constants for maintainability.


0 commentaires:

Enregistrer un commentaire