What is the best way to get the InnerXml of an XElement?

itbloger 2020. 6. 16. 20:37

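What is the best way to get the contents of the mixed body element in the code below? The element might contain either XHTML or text, but I just want its contents in string form. The XmlElement type has the InnerXml property, which is exactly what I'm after.

The code as written almost does what I want, but it includes the surrounding <body>...</body> element, which I don't want.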

XDocument doc = XDocument.Load(new StreamReader(s));
var templates = from t in doc.Descendants("template")
                where t.Attribute("name").Value == templateName
                select new
                {
                   Subject = t.Element("subject").Value,
                   Body = t.Element("body").ToString()
                };

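I wanted to see which of the suggested solutions performed best, so I ran some comparative tests. Out of interest, I also compared the LINQ methods with the plain old System.Xml method suggested by Greg. The variation was interesting and not what I expected: the slowest methods were more than 3 times slower than the fastest.

The results, ordered from fastest to slowest: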

  1. CreateReader - Instance Hunter (0.113 seconds)
  2. Plain old System.Xml - Greg Hurlman (0.134 seconds)
  3. Aggregate with string concatenation - Mike Powell (0.324 seconds)
  4. StringBuilder - Vin (0.333 seconds)
  5. String.Join on array - Terry (0.360 seconds)
  6. String.Concat on array - Marcin Kosieradzki (0.364 seconds)

Method

I used a single XML document with 20 identical nodes (called 'hint'):

<hint>
  <strong>Thinking of using a fake address?</strong>
  <br />
  Please don't. If we can't verify your address we might just
  have to reject your application.
</hint>

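The numbers shown in seconds above are the result of extracting the "inner XML" of the 20 nodes 1000 times in a row, and taking the average (mean) of 5 runs. I didn't include the time it took to load and parse the XML into an XmlDocument (for the System.Xml method) or an XDocument (for all the others).

The LINQ algorithms I used were as follows. (C# - all take an XElement "parent" and return the inner XML string.)

CreateReader: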

var reader = parent.CreateReader();
reader.MoveToContent();

return reader.ReadInnerXml();

Aggregate with string concatenation:

return parent.Nodes().Aggregate("", (b, node) => b += node.ToString());

StringBuilder :

StringBuilder sb = new StringBuilder();

foreach(var node in parent.Nodes()) {
    sb.Append(node.ToString());
}

return sb.ToString();

String.Join on array:

return String.Join("", parent.Nodes().Select(x => x.ToString()).ToArray());

String.Concat on array:

return String.Concat(parent.Nodes().Select(x => x.ToString()).ToArray());

I haven't shown the "Plain old System.Xml" algorithm here because it just calls .InnerXml on the nodes.
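For readers who haven't used the older API, here is a minimal sketch of what calling .InnerXml on a node looks like. It is not the benchmark code from this answer: the class name, the method name and the "//hint" XPath are assumptions made up for illustration.

```csharp
using System;
using System.Xml;

public static class SystemXmlInnerXmlDemo
{
    // Hypothetical helper: loads the XML text into an XmlDocument and returns
    // the inner XML of the first <hint> element, or null if there is none.
    public static string FirstHintInnerXml(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);

        // SelectSingleNode returns null when the XPath matches nothing.
        XmlNode hint = doc.SelectSingleNode("//hint");
        return hint?.InnerXml;
    }
}
```

Calling SystemXmlInnerXmlDemo.FirstHintInnerXml on a document containing a hint node should give back the child markup without the surrounding <hint> tags.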


Conclusion

If performance is important (e.g. lots of XML, parsed frequently), I'd use Daniel's CreateReader method every time. If you're just doing a few queries, you might want to use Mike's more concise Aggregate method.

If you're using XML on large elements with lots of nodes (maybe 100's), you'd probably start to see the benefit of using StringBuilder over the Aggregate method, but not over CreateReader. I don't think the Join and Concat methods would ever be more efficient in these conditions because of the penalty of converting a large list to a large array (even obvious here with smaller lists).


I think this is a much better method (in VB, shouldn't be hard to translate):

Given an XElement x:

Dim xReader = x.CreateReader
xReader.MoveToContent
xReader.ReadInnerXml

How about using this extension method on XElement? It worked for me!

public static string InnerXml(this XElement element)
{
    StringBuilder innerXml = new StringBuilder();

    foreach (XNode node in element.Nodes())
    {
        // append node's xml string to innerXml
        innerXml.Append(node.ToString());
    }

    return innerXml.ToString();
}

OR use a little bit of Linq

public static string InnerXml(this XElement element)
{
    StringBuilder innerXml = new StringBuilder();
    element.Nodes().ToList().ForEach(node => innerXml.Append(node.ToString()));

    return innerXml.ToString();
}

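Note: The code above has to use element.Nodes() as opposed to element.Elements(). It is very important to remember the difference between the two. element.Nodes() gives you every child node (XText, XElement, XComment and so on), whereas element.Elements() gives you only the child XElement nodes. Attributes are not nodes in LINQ to XML, so neither method returns them.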
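To make the difference concrete, here is a small sketch that concatenates the children both ways. The class and method names are made up for illustration; only Nodes(), Elements() and ToString() come from LINQ to XML itself.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class NodesVersusElementsDemo
{
    // Concatenates every child node: text, elements, comments and so on.
    public static string JoinNodes(XElement element) =>
        string.Concat(element.Nodes().Select(n => n.ToString()));

    // Concatenates child elements only, so text between them is lost.
    public static string JoinElements(XElement element) =>
        string.Concat(element.Elements().Select(e => e.ToString()));
}
```

For XElement.Parse("<body>Hello <b>world</b>!</body>"), JoinNodes should return "Hello <b>world</b>!" while JoinElements returns only "<b>world</b>", which is why Elements() is the wrong choice for mixed content.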


With all due credit to those who discovered and proved the best approach (thanks!), here it is wrapped up in an extension method:

public static string InnerXml(this XNode node) {
    using (var reader = node.CreateReader()) {
        reader.MoveToContent();
        return reader.ReadInnerXml();
    }
}

Keep it simple and efficient:

String.Concat(node.Nodes().Select(x => x.ToString()).ToArray())
  • Aggregate is memory and performance inefficient when concatenating strings
  • Using Join("", sth) is using two times bigger string array than Concat... And looks quite strange in code.
  • Using += looks very odd, but apparently is not much worse than using '+' - probably would be optimized to the same code, because the assignment result is unused and might be safely removed by the compiler.
  • StringBuilder is so imperative - and everybody knows that unnecessary "state" sucks.

I ended up using this:

Body = t.Element("body").Nodes().Aggregate("", (b, node) => b += node.ToString());

Personally, I ended up writing an InnerXml extension method using the Aggregate method:

public static string InnerXml(this XElement thiz)
{
   return thiz.Nodes().Aggregate( string.Empty, ( element, node ) => element += node.ToString() );
}

My client code is then just as terse as it would be with the old System.Xml namespace:

var innerXml = myXElement.InnerXml();

@Greg: It appears you've edited your answer to be a completely different answer. To which my answer is yes, I could do this using System.Xml but was hoping to get my feet wet with LINQ to XML.

I'll leave my original reply below in case anyone else wonders why I can't just use the XElement's .Value property to get what I need:

@Greg: The Value property concatenates all the text contents of any child nodes. So if the body element contains only text it works, but if it contains XHTML I get all the text concatenated together but none of the tags.
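A quick way to see this for yourself is to compare Value with the CreateReader approach from the other answers on the same element. This is only a sketch; the class and method names are made up for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class ValueVersusInnerXmlDemo
{
    // Value concatenates the text of all descendants and drops the tags.
    public static string TextOnly(XElement element) => element.Value;

    // Reading the inner XML through an XmlReader keeps the child markup.
    public static string InnerXml(XElement element)
    {
        using (var reader = element.CreateReader())
        {
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
    }
}
```

With a body element parsed from "<body>Hello <b>world</b>!</body>", TextOnly gives "Hello world!" and InnerXml should give "Hello <b>world</b>!".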


// using Regex might be faster to simply trim the begin and end element tag

var content = element.ToString();
var matchBegin = Regex.Match(content, @"<.+?>");
content = content.Substring(matchBegin.Index + matchBegin.Length);          
var matchEnd = Regex.Match(content, @"</.+?>", RegexOptions.RightToLeft);
content = content.Substring(0, matchEnd.Index);

doc.ToString() or doc.ToString(SaveOptions) does the work. See http://msdn.microsoft.com/en-us/library/system.xml.linq.xelement.tostring(v=vs.110).aspx


Is it possible to use the System.Xml namespace objects to get the job done here instead of using LINQ? As you already mentioned, XmlNode.InnerXml is exactly what you need.


Wondering if (notice I got rid of the b+= and just have b+)

t.Element( "body" ).Nodes()
 .Aggregate( "", ( b, node ) => b + node.ToString() );

might be slightly less efficient than

string.Join( "", t.Element( "body" ).Nodes()
                  .Select( n => n.ToString() ).ToArray() );

Not 100% sure...but glancing at Aggregate() and string.Join() in Reflector...I think I read it as Aggregate just appending a returning value, so essentially you get:

string = string + string

versus string.Join, which has some mention in there of FastStringAllocation or something, which makes me think the folks at Microsoft might have put some extra performance boost in there. Of course my .ToArray() call may negate that, but I just wanted to offer up another suggestion.


you know? the best thing to do is to back to CDATA :( im looking at solutions here but i think CDATA is by far the simplest and cheapest, not the most convenient to develop with tho


var innerXmlAsText= XElement.Parse(xmlContent)
                    .Descendants()
                    .Where(n => n.Name.LocalName == "template")
                    .Elements()
                    .Single()
                    .ToString();

Will do the job for you


public static string InnerXml(this XElement xElement)
{
    //remove start tag
    string innerXml = xElement.ToString().Trim().Replace(string.Format("<{0}>", xElement.Name), "");
    ////remove end tag
    innerXml = innerXml.Trim().Replace(string.Format("</{0}>", xElement.Name), "");
    return innerXml.Trim();
}

Reference URL: https://stackoverflow.com/questions/3793/best-way-to-get-innerxml-of-an-xelement