WIP: Update LoCoMo evaluation scripts by edwinyyyu · Pull Request #651 · MemMachine/MemMachine · GitHub

Conversation

@edwinyyyu (Contributor) commented Dec 1, 2025

Purpose of the change

Update LoCoMo evaluation scripts for improved performance and to work with new (internal) APIs.

Description

WIP. Uses internal APIs to test long-term memory in isolation.

Type of change

[Please delete options that are not relevant.]

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (does not change functionality, e.g., code style improvements, linting)
  • Documentation update
  • Project Maintenance (updates to build scripts, CI, etc., that do not affect the main project)
  • Security (improves security without changing functionality)

How Has This Been Tested?

WIP

  • Unit Test
  • Integration Test
  • End-to-end Test
  • Test Script (please provide)
  • Manual verification (list step-by-step instructions)

Test Results: [Attach logs, screenshots, or relevant output]

Checklist

[Please delete options that are not relevant.]

  • I have signed the commit(s) within this pull request
  • My code follows the style guidelines of this project (See STYLE_GUIDE.md)
  • I have performed a self-review of my own code
  • I have commented my code
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

Maintainer Checklist

  • Confirmed all checks passed
  • Contributor has signed the commit(s)
  • Reviewed the code
  • Run, Tested, and Verified the change(s) work as expected

@edwinyyyu force-pushed the update_evaluation branch 7 times, most recently from 33fc9dc to 3093e29, on December 3, 2025 22:11
@edwinyyyu force-pushed the update_evaluation branch 3 times, most recently from 384d4cd to 41f8250, on December 4, 2025 20:02
@edwinyyyu requested a review from tomw-mv on December 5, 2025 00:16
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
