Description
Environment
- Pythonnet version: Latest Dev 3.0.0
- Python version: 3.6.8
- Operating System: Windows 10
Details
I was doing some performance testing on the latest PythonNet and noticed a significant slowdown in the conversion of CLR objects to Python. After some digging, I believe I found where the slowdown is coming from.
In #1287, the addition of MaybeType changed the ClassManager dictionary cache, which now stores ClassBase instances with MaybeType as the key. But because ClassManager's GetClass function looks up the cache using a Type, the Type is implicitly converted to a MaybeType first. This conversion, followed by the comparison needed to find the key in the dictionary, leads to a major difference in performance when converting large quantities of objects to Python, since a new MaybeType has to be initialized every time the cache is used.
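To make the cost concrete, here is a minimal sketch of the pattern described above. It does not use the real pythonnet types: TypeKey is a hypothetical stand-in for MaybeType, and the eager name fetch in its constructor is an assumption about what the wrapper does on construction. The counter shows that every lookup with a plain Type builds a new wrapper before the dictionary is even consulted.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for MaybeType: wraps a Type and eagerly records its name.
public sealed class TypeKey
{
    // Counts how many wrappers have been constructed (for illustration only).
    public static int Constructed;

    public Type Value { get; }
    public string Name { get; }

    public TypeKey(Type type)
    {
        Value = type;
        // Assumption: the name is fetched eagerly, on every construction.
        Name = type.AssemblyQualifiedName;
        Constructed++;
    }

    // Same shape as the implicit operator quoted in this issue.
    public static implicit operator TypeKey(Type type) => new TypeKey(type);

    public override bool Equals(object obj) => obj is TypeKey other && Value == other.Value;

    public override int GetHashCode() => Value.GetHashCode();
}

public static class CacheDemo
{
    // Returns the number of wrappers constructed while performing the lookups.
    public static int CountConstructions(int lookups)
    {
        var cache = new Dictionary<TypeKey, string>();
        cache[typeof(string)] = "cached"; // one implicit conversion for the insert

        TypeKey.Constructed = 0;
        Type key = typeof(string);
        for (int i = 0; i < lookups; i++)
        {
            // The Type argument is implicitly converted to TypeKey before the lookup runs.
            cache.TryGetValue(key, out _);
        }
        return TypeKey.Constructed;
    }
}
```

In this sketch, the number of constructed wrappers equals the number of lookups, which matches the behaviour described above: overriding Equals() and GetHashCode() makes the lookup itself cheaper but cannot avoid the construction.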
- What commands did you run to trigger this issue? If you can provide a Minimal, Complete, and Verifiable example this will help us understand the issue.
Here is a unit test, without any pass/fail assertions, that I used to debug this issue. I paired it with dotTrace to see the difference in performance.
[Test]
public void Test()
{
    for (int i = 0; i < 1000000; i++)
    {
        var testSlice = new Slice();
        testSlice.ToPython();
    }
}

public class Slice
{
    public Dictionary<char, int> intDict = new Dictionary<char, int>();
    public List<int> intList = new List<int>();
    public string test = "pepe";
    public decimal testDecimal = 5.5m;

    public Slice()
    {
        for (int i = 0; i < 20; i++)
        {
            var key = 'A' + i;
            intDict.Add((char)key, i);
            intList.Add(i);
        }
    }
}
[Figure: current branch performance on 1 million calls of the same object]
[Figure: performance without MaybeType in ClassManager (cache dictionary is <Type, ClassBase>)]
Attempt at a fix while maintaining MaybeType
I did attempt to improve the performance by overriding the Equals() and GetHashCode() functions for MaybeType, which showed a little improvement and gave a bit more visibility into the slowdown. Unfortunately, the Type is already implicitly converted before Equals() is called, because the ClassManager cache dictionary is <MaybeType, ClassBase>:
public static implicit operator MaybeType (Type ob) => new MaybeType(ob);
Here was my attempt.
/// <summary>
/// Determines whether the specified <see cref="T:System.Object"/> is equal to the current <see cref="T:System.Object"/>.
/// </summary>
/// <returns>
/// true if the specified object is equal to the current object; otherwise, false.
/// </returns>
/// <param name="obj">The object to compare with the current object. </param><filterpriority>2</filterpriority>
public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;

    // Compare as a Type
    if (obj is Type typeObj)
    {
        return this.Value == typeObj;
    }

    // If MaybeType just compare Types
    if (obj is MaybeType maybeObj)
    {
        return Value == maybeObj.Value;
    }

    // It must be false
    return false;
}

/// <summary>
/// Serves as a hash function for a particular type.
/// </summary>
/// <returns>
/// A hash code for the current <see cref="T:System.Object"/>.
/// </returns>
/// <filterpriority>2</filterpriority>
public override int GetHashCode()
{
    // Use Type HashCode
    unchecked { return Value.GetHashCode(); }
}
It appears that the slowest part is actually fetching the fully qualified name, which could be changed to be fetched lazily when requested instead of during construction.
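As a rough illustration of the lazy approach, here is a sketch using a hypothetical LazyTypeKey class rather than the real MaybeType. The field names, the fetch counter, and the fallback to ToString() are all assumptions made for the example. Whether deferring the name is safe in pythonnet depends on whether the name must be captured while the type is still loaded, which this sketch does not address; it is also not thread-safe.

```csharp
using System;

// Hypothetical variant of the wrapper: the name is resolved on first access,
// not in the constructor.
public sealed class LazyTypeKey
{
    // Counts how often the name was actually resolved (for illustration only).
    public static int NameFetches;

    private readonly Type type;
    private string name; // stays null until Name is first requested

    public LazyTypeKey(Type type)
    {
        this.type = type ?? throw new ArgumentNullException(nameof(type));
    }

    public Type Value => type;

    public string Name
    {
        get
        {
            if (name == null)
            {
                // AssemblyQualifiedName can be null (e.g. for generic type parameters),
                // so fall back to ToString() to keep the cached value non-null.
                name = type.AssemblyQualifiedName ?? type.ToString();
                NameFetches++;
            }
            return name;
        }
    }

    public override bool Equals(object obj) => obj is LazyTypeKey other && type == other.type;

    public override int GetHashCode() => type.GetHashCode();
}
```

With this shape, constructing a key only to perform a dictionary lookup never touches the name; the cost is paid once, and only by callers that actually read Name.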