
We recently ran an experiment on real M&A due diligence documents (technical overviews, security controls, retention policies, and integration requirements) to see which inconsistencies and gaps an automated comparison would surface.

What stood out wasn’t missing documentation, but conflicting statements across documents that would normally pass a manual review. Things like architecture drift, unclear retention timelines, and control enforcement gaps that create post-close risk.

This write-up walks through a real diligence-style example, the kinds of issues that surfaced, and where automated comparison helped (and where human judgment still mattered).

Curious how others approach diligence at scale and whether automation has helped or hurt your process.


I’m building something that keeps finding gaps I didn’t realize were gaps.

It’s called Riftur, a gap analysis tool that compares two documents and highlights missing requirements, inconsistencies, and coverage gaps. The interesting part for us has been getting the system to understand intent instead of just keywords, so it can flag partial matches and subtle gaps rather than just “present / not present.”

Still early, but it’s been useful in ways we didn’t fully expect. If anyone’s curious, you can try the demo here: https://riftur.com. I'm happy to hear thoughts or learn how others handle this kind of review work.


We built Riftur to automate the repetitive work involved in document gap analysis. Many teams still compare requirements to drafts manually—switching between two documents, highlighting mismatches, and tracking gaps in spreadsheets. This seemed like something AI could help with, so we experimented with a way to handle that comparison step transparently and reliably.

Riftur takes a requirement set (proposal instructions, audit criteria, evaluation rubrics, regulatory checklists, internal standards, etc.), interprets each item, and compares it to an uploaded document. It highlights what appears missing, partially addressed, or inconsistent, reducing the need for side-by-side manual review.
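
To make “interprets each item” a bit more concrete, here is a minimal sketch of one way that kind of matching can work (purely illustrative; the embedding model and thresholds are placeholders, not our actual pipeline): embed each requirement and each document chunk, then label coverage from best-match semantic similarity rather than keyword overlap.

    from sentence_transformers import SentenceTransformer, util

    # Rough sketch only: model name and thresholds are placeholders,
    # not the production pipeline.
    model = SentenceTransformer("all-MiniLM-L6-v2")

    def coverage(requirements, doc_chunks, full=0.75, partial=0.55):
        req_emb = model.encode(requirements, convert_to_tensor=True)
        doc_emb = model.encode(doc_chunks, convert_to_tensor=True)
        # one row per requirement, one column per document chunk
        sims = util.cos_sim(req_emb, doc_emb)
        report = []
        for i, req in enumerate(requirements):
            best = float(sims[i].max())
            label = ("addressed" if best >= full
                     else "partial" if best >= partial
                     else "missing")
            report.append((req, label, round(best, 2)))
        return report

    # The point: "kept for one quarter" can register against a 90-day
    # retention requirement on meaning, even with little keyword overlap.
    print(coverage(
        ["Backups must be retained for 90 days"],
        ["Snapshots are kept for one quarter before deletion"],
    ))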

The tool isn’t limited to compliance: it can also map internal engineering guidelines to existing artifacts, review technical documentation for alignment, check process adherence, verify that code follows agreed guidelines, or evaluate structured submissions in education and training contexts.

The current version is live, and we’re trying to understand how well the approach generalizes across different document types. Happy to answer questions about the underlying method, tradeoffs, or implementation details.


I've been tinkering with this idea for a while and built PhantomShift! I just published a breakdown — with a full demo video — showing how it plays out in the Recon phase of a red team mission. Instead of spitting out canned “try sudo -l” advice, the AI stays in your terminal, tracks what you’ve already done, and nudges you toward the next logical move. It’s weirdly fun watching it act more like a sparring partner than a chatbot!

I wrote up the story on my Medium blog, linked here, if you want to check it out.

Curious what you think — where would you want an AI to jump in and save you time during an op?


You’re working through a red team lab or real engagement. You’ve handled recon, started poking at privilege escalation paths, and hit a few dead ends. You know there’s more to explore—but now you’re second-guessing the next move.

Most training platforms don’t help with that part. They show you what can be done, but not what makes sense right now, in your specific context. No feedback loops, no real-time reasoning—just flags or checklists.

We started exploring how an AI assistant might help newer operators train the way real ops flow. Not by walking through playbooks, but by surfacing options that adapt to what’s already happened in the shell. A kind of guided sparring partner that builds decision-making muscle, not just technical recall.

This post walks through what we’ve been thinking, what PhantomShift is doing so far, and how this kind of AI could fit into operator development. Would love thoughts from anyone working in red team education, CTF training, or onboarding new security hires!


We’ve been thinking about what federal procurement might look like if it actually prioritized working software over perfect paperwork. Picture this: instead of writing a 100-page RFP response, you ship a functional prototype. Instead of needing a SAM registration and three layers of past performance just to be heard, you’re asked a simple question — “Can you solve this?” — and judged by what you build, not how well you mirror government jargon.

We wrote a piece exploring what that world might look like — where contracting officers are treated as users, proposals are replaced with real feedback loops, and speed to deploy matters more than speed to comply.

It’s part frustration, part thought experiment, and part call to start building like the system already works that way.

Would love thoughts from folks who’ve built for gov, around gov, or despite it.


Good question — we’re thinking about it as a way to support operator training and tactical development.

It’s not about giving you the answer, but about exposing the range of reasonable next steps in a given situation. Because the system tracks prior commands and environment state, its suggestions can help newer operators learn how context influences decision-making — not just what can be done, but why certain moves make more sense at that moment.

In that sense, it’s more like a guided sparring partner than a shortcut! Still early days, but we’re curious how something like this could help folks build intuition under pressure. Got any thoughts?


You’re midway through a pentesting engagement. Recon’s wrapped, and a couple of privilege escalation paths have already failed. You flip over to ChatGPT hoping for something useful, but it offers the usual: SUID binaries, kernel exploits, and weak folder permissions. It doesn’t know your host, the tools you've used, or what phase of the operation you're in—and that’s the real problem.

We started tinkering with a question: what would it take to make an assistant that thinks more like an operator under pressure? One that tracks what’s actually happening in your shell without you having to copy and paste context over and over again. It watches the flow of your session, reasons over what you’ve already done, and suggests next steps that are grounded in your actual operation, not pulled from some generic playbook.
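
If it helps to picture the mechanics, here is a toy sketch of the session-tracking idea (stdlib only, illustrative rather than our actual implementation): keep a rolling window of commands plus their output, and fold that history into the prompt so suggestions are grounded in what has already happened.

    import subprocess
    from collections import deque

    # Toy illustration only, not the actual PhantomShift code.
    class SessionTracker:
        def __init__(self, max_events=50):
            # rolling window of (command, truncated output) pairs
            self.events = deque(maxlen=max_events)

        def run(self, cmd):
            out = subprocess.run(cmd, shell=True, capture_output=True, text=True)
            self.events.append((cmd, (out.stdout + out.stderr)[:500]))
            return out.stdout

        def next_step_prompt(self, phase):
            history = "\n".join(f"$ {c}\n{o}" for c, o in self.events)
            return (
                f"Phase: {phase}\n"
                f"Commands already run and their output:\n{history}\n"
                "Suggest the next logical step for this host and explain "
                "why it fits what has already been tried."
            )

    session = SessionTracker()
    session.run("id")
    session.run("uname -a")
    # The assembled prompt would then go to whatever LLM backend you prefer.
    print(session.next_step_prompt("privilege escalation"))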

This write-up shares what we’ve learned so far, what didn’t work, and where we think things could go. Would love feedback from folks building or breaking in the same space.

