Phinn: On engineering a real-time phishing simulation proxy

Advanced phishing attacks are becoming increasingly commonplace, with tools that allow attackers to harvest credentials, bypass two-factor authentication (2FA) and run automated post-exploit scripts the instant a victim enters their credentials. This post takes a look at our journey towards releasing Phinn, the real-time phishing simulation proxy that sits at the core of the PhishDeck phishing simulation platform.

The Problem

In recent years we have seen a dramatic surge and shift in the phishing landscape. We now have open-source tools that make it far more accessible than ever before for attackers to set up phishing websites that are virtually indistinguishable from their original counterparts, both visually and, more importantly, in their behaviour.

The adoption of 2FA (albeit somewhat slow) raises the barrier to entry for attackers. Security vendors have been innovative in their drive towards WebAuthn/U2F solutions, yet phishing simulation platforms have been slower to reflect advanced phishing and social engineering attacks and thereby gauge how effective those defences really are, particularly as WebAuthn makes that distinction clearer and more precise. Anyone who operates in, or with, a security blue-team (i.e. defense) knows that time-saving, effective solutions they can rely on are imperative as a counterbalance.

The security community has been phenomenal at open-sourcing standalone tools and sharing knowledge through technical articles and conference talks around real-time phishing. The primary objective of these tools is to form part of a red-team's, security consultant's or pentester's toolkit during security engagements: harvesting user credentials, sessions and 2FA/MFA tokens, as well as running post-exploit automation scripts.

As a phishing simulation platform, PhishDeck's scope is different, and therefore has different requirements that need to be met, be that performance, scalability or safety. We could have simply turned to popular open-source real-time phishing tools such as EvilGinx2, Modlishka or Muraena and been done with it. However, since PhishDeck is not designed to run red-team exercises or penetration tests (or to be used maliciously), we wanted to make sure that our product is as safe as possible to use for both the person being targeted and the organization running the simulation.

On top of this, being the first phishing simulation vendor to incorporate this type of simulation, we wanted to make sure it was easy to use. This means it needed to be fast, feature-rich and tightly knit with the rest of the PhishDeck platform. Users need to be able to simulate real-time phishing campaigns without having to host their own infrastructure, manage domains, or fiddle with HTML, CSS and JavaScript code.

Humble beginnings

When we initially started working on a real-time phishing simulation proxy, we explored developing a hard-fork of EvilGinx2, for a number of reasons.

  1. Its open-source license was permissive for private and commercial use;
  2. A lot of plumbing around service configuration (what EvilGinx2 refers to as phishlets), and Let’s Encrypt certificate generation was already implemented;
  3. We could use and expand on community-written services, initially thinking it would develop into a fairly large template library without breaking compatibility with our hard-fork;
  4. Merging with upstream from time to time would not be too difficult, as the codebase was fairly small and updates were not particularly frequent.

We went ahead and developed a heavily stripped-down version of EvilGinx2, implementing some key functionality around client-side code injection (which the project now supports) and security validation via JWT tokens for our users. We also removed a lot of the built-in functionality: no victim sessions (i.e. session database), no interactive terminal, and no credential harvesting. Only the certificate generator and the HTTP phishing proxy remained.

As time went on, we started hitting some notable problems that were going to become more and more difficult to overcome, effectively making the hard-fork model unsustainable.

  1. Using the HTTP-01 challenge type for TLS certificate generation heavily limited what type of services we could proxy. Our aim was to be able to proxy services that operate on wildcard subdomains (e.g. *.slack.com) that are exceptionally popular and make for very convincing phishing simulations. Achieving this requires us to move to the DNS-01 challenge type, but also requires the service configuration and core proxy to be “wildcard-aware” which is no small feat.
  2. Community service configurations (i.e. phishlets) didn’t pick up at the frequency that we had originally anticipated. We really wanted to contribute more here, but found ourselves spending too much time on the proxy and very little on adding new services.
  3. We ended up deviating heavily from upstream, including the core proxy. Merging ended up being a very arduous and manual process, particularly with breaking schema changes around service configurations.

The key takeaway for us here, as most people in the software industry already know, is that maintaining a hard-fork is difficult. With that said, if we had to go back and face the same decision, we would proceed in the same direction. The EvilGinx2 hard-fork was a fantastic proof of value and really allowed us to prototype our ideas in the context of a larger phishing simulation platform and better gauge our market fit.

Enter Phinn, our real-time phishing simulation proxy

With a working prototype and a handful of lessons learnt, we went back to the drawing board and fleshed out the requirements for a new in-house proxy, one that is tightly integrated with PhishDeck. We love naming things, and our namespace is ripe for all sorts of word play, so we named our new real-time phishing proxy Phinn.

We came up with a few notable requirements that we had to meet in order to provide value as well as scale our platform to the general public.

  • Wildcard DNS support. We want to support wildcard domains so we can simulate realistic, targeted attacks against a variety of services.
  • Performant. Since we are effectively mirroring/replaying traffic, we roughly double response times, making it less likely for Targets to engage with the phishing website and making it costlier for us to operate. Additionally, each PhishDeck user on average translates to tens or hundreds of Targets, resulting in a substantial increase in proxy traffic when contrasted with our platform traffic.
  • Safety. The objective for Phinn is to only proxy traffic for legitimate Targets that form part of a verified phishing simulation Campaign and to track specific events with the least amount of event metadata. We do not want to process, store or use any user sessions or credentials.

Design

The scope of the initial proxy was to come up with a crude real-time phishing proxy implementation to gauge the practical soundness of the idea.

We managed to get a working prototype going over HTTP (no TLS) within a day or so, applied some synthetic HTTP load and monitored CPU/memory usage. Everything fell within our required parameters and so we started laying out the tracks for what version 1.0 would look like. We had to future-proof the design as much as possible as we knew we would not be implementing all features right off the bat.

We needed to lay foundations that would serve us for future improvements over the coming years. The first key design artefact we committed to was the high-level sequence of a successful proxied request. This helps us better visualise the traffic-replay aspect of the proxy and allows us to break the implementation down into smaller segments.

We also needed to factor in how the phishing proxy would operate with the rest of the PhishDeck platform. It needs to perform as one holistic platform, propagating data from the PhishDeck Campaign, to the Target via the Campaign's phishing email, to Phinn, and all the way back to the same PhishDeck Campaign (full circle).

After some tweaking and testing we settled on the following set of components. The following diagram may be quite heavy, but I encourage you to take some time and grasp it segment by segment, using the flow markers (i.e. numbers) as your guide.

Note – The way we designed our Callback tokens (step 5 above) is similar to how opaque tokens work in certain OAuth flows. We pack the metadata internally and only push an opaque Callback token (i.e. a reference) to the Target. This suits our needs well, as it avoids ever having to pass sensitive Target metadata (e.g. email) over the wire.
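To illustrate the idea, here is a minimal sketch (not Phinn's actual implementation; the struct fields, the in-memory store and the rand-based token format are assumptions made for the example): the token pushed to the Target is just a random reference, while the metadata stays server-side.

use std::collections::HashMap;

// Hypothetical server-side record: only the opaque token ever reaches the Target.
struct CallbackMetadata {
    campaign_id: u64,
    target_email: String,
    event: &'static str, // e.g. "link_clicked" or "credentials_entered"
}

fn issue_callback_token(
    store: &mut HashMap<String, CallbackMetadata>,
    meta: CallbackMetadata,
) -> String {
    // The token carries no data itself; it is only a key into the server-side store.
    let token = format!("{:032x}", rand::random::<u128>());
    store.insert(token.clone(), meta);
    token
}

fn main() {
    let mut store = HashMap::new();
    let token = issue_callback_token(
        &mut store,
        CallbackMetadata {
            campaign_id: 42,
            target_email: "target@example.com".into(),
            event: "link_clicked",
        },
    );
    // When the callback later fires, the token is resolved back to its metadata.
    assert!(store.contains_key(&token));
}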

In the platform, we are then able to show interesting metrics and charts for that specific Campaign, specifically the Link Clicked and Credentials Entered data points, which correspond to two distinct callback tokens that fire in the Target's browser depending on their activity.

These designs were originally a lot less neat, primarily consisting of whiteboard drafts that were later formalized into interim architecture decision records (ADRs). Having such documentation in place early can be invaluable, particularly for software projects with a well-defined problem definition upfront. They act as our north star and allow us to break seemingly complicated problems down into multiple smaller problems to be solved.

Building a real-time phishing simulation proxy in Rust

We had always eyed Rust as the primary candidate for the new phishing proxy. In the security world, being able to have performance and safety in the same box is rare, and Rust fits that to a tee. The single largest factor behind the move to Rust for the core proxy was Cloudflare's announcement of their low-output-latency HTML parser and rewriter, (LOL)HTML.

With the EvilGinx2 hard-fork, we would hit response-time spikes during load tests, particularly when handling concurrent users on lower-spec compute instances. One cause was how our EvilGinx2 fork handled traffic transformation, specifically response bodies (EvilGinx2 has improved this substantially since).

We were constrained to some computationally heavy regular expressions performing all sorts of checks multiple times per request, resulting in multiple buffers being cloned and more time spent on traffic processing. Take the following original snippet that we had at the time (there's no need to comb through it). It iterates over every traffic filter (which we have since managed to do away with entirely) and performs a number of regular-expression and string replacements.

// Expand placeholder tokens (e.g. {hostname}, {subdomain}) in each traffic filter's
// search regexp and replacement string before applying it to the response.
for _, sf := range sfs {
	reS := sf.regexp
	replaceS := sf.replace
	phishHostname, _ := p.replaceHostWithPhished(combineHost(sf.subdomain, sf.domain))
	phishSub, _ := p.getPhishSub(phishHostname)
	reS = strings.Replace(reS, "{hostname}", regexp.QuoteMeta(combineHost(sf.subdomain, sf.domain)), -1)
	reS = strings.Replace(reS, "{subdomain}", regexp.QuoteMeta(sf.subdomain), -1)
	reS = strings.Replace(reS, "{domain}", regexp.QuoteMeta(sf.domain), -1)
	reS = strings.Replace(reS, "{search}", regexp.QuoteMeta(sf.regexp), -1)
	replaceS = strings.Replace(replaceS, "{hostname}", phishHostname, -1)
	replaceS = strings.Replace(replaceS, "{subdomain}", phishSub, -1)
	replaceS = strings.Replace(replaceS, "{hostname_regexp}", regexp.QuoteMeta(phishHostname), -1)
	replaceS = strings.Replace(replaceS, "{subdomain_regexp}", regexp.QuoteMeta(phishSub), -1)
	replaceS = strings.Replace(replaceS, "{search}", regexp.QuoteMeta(sf.replace), -1)
	phishDomain, ok := p.cfg.GetSiteDomain(pl.Name)
	// ...
}

That is not all. There is also a key point where we need to inject a <script> tag containing the client-side code injection mentioned earlier, and it needs to be injected right after the <head> tag so that the browser interprets it as early as possible.

A naïve approach is to scan the response body for the <head> tag, split the buffer at the matched index, offset the tail slice by the size of the client-side code injection, and ultimately stitch all three slices back together. It's tough to grasp with words, so here's an attempt at visualising it.

When receiving an HTTP response, we quickly parse the start and end of the <head> tag. With that we do two things: (a) split the buffer into a head and a tail, and (b) copy the <head> tag and its child elements into a local temporary buffer.

In our local temporary buffer, we figure out where to insert our client-side code. Since we are explicitly searching for the <head> tag, we take an educated guess and insert our client-side code at index 6 (since <head> occupies indices 0..5), pushing the rest of the array to the right.

We insert the client-side code to do two things: (1) trigger a callback when the DOM is loaded and (2) hook a callback onto any password inputs on the page. Each callback carries a pointer that is sent to the PhishDeck platform to notify the user that the event occurred. With the client-side code inserted, we can then stitch all three components together and push the result to the client to execute in their browser.
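As a rough sketch of that naive approach (illustrative only; our original implementation lived in the Go fork and differed in the details), the buffer splitting and stitching looks roughly like this:

// Naive approach, sketched for illustration: find "<head>", split the buffer there
// and splice the injected <script> in between.
fn inject_after_head(body: &[u8], injection: &[u8]) -> Vec<u8> {
    const HEAD: &[u8] = b"<head>";

    // Find the byte offset of "<head>" (ignoring attributes, casing, etc. for brevity).
    match body.windows(HEAD.len()).position(|window| window == HEAD) {
        Some(i) => {
            let split = i + HEAD.len(); // index 6 relative to the tag, as described above
            let mut out = Vec::with_capacity(body.len() + injection.len());
            out.extend_from_slice(&body[..split]); // head slice, including "<head>"
            out.extend_from_slice(injection);      // injected <script> tag
            out.extend_from_slice(&body[split..]); // tail slice, pushed to the right
            out
        }
        None => body.to_vec(), // no <head> tag found; pass the body through untouched
    }
}

Every response that flows through the proxy pays for that scan and the extra copies, which is exactly the kind of overhead we wanted to avoid.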

With Rust and (LOL)HTML, we boil all of this down to a far simpler solution. We feed the response buffer directly to (LOL)HTML, which internally takes ownership of the buffer, and simply configure a handler so that any <head> tag has its content prepended with a new <script> tag containing our client-side code.

use lol_html::{element, html_content::ContentType, HtmlRewriter, Settings};

let mut rewriter = HtmlRewriter::try_new(
    Settings {
        element_content_handlers: vec![
            // For every <head> element, prepend a <script> tag containing our
            // client-side code so the browser interprets it as early as possible.
            element!("head", |el| {
                el.prepend(
                    &format!(
                        r#"<script type="text/javascript">{}</script>"#,
                        &client_code_injection
                    ),
                    ContentType::Html,
                );
                Ok(())
            }),
            // ...
        ],
        ..Settings::default()
    },
    // Rewritten chunks are streamed into our output buffer as they are produced.
    |c: &[u8]| output.extend_from_slice(c),
)
.unwrap();
rewriter.write(&bytes).unwrap();
rewriter.end().unwrap();

Since (LOL)HTML is a streaming HTML parser and rewriter, the required code is concise, fast and far shorter than before. Win! There are quite a few other rewrites that we perform, but the idea is effectively the same.

So how does (LOL)HTML fit into the bigger picture? Let us take a look at the following decomposition view of Phinn to understand how the internal components are laid out. Decomposition views (or at least my version of them) are helpful, as they allow us to take a peek at system internals without too much nitty-gritty detail.

Most of the magic happens in the on_request_svc service handler that we register. The objective of the service handler here is to accomplish the following.

  1. Receive an HTTP Request along with safe pointers to our global configuration and services;
  2. Rewrite request traffic and send it to the desired website;
  3. Receive and transform the response back, including Headers, Cookies and response Body;
  4. Send the mutated response to the user’s browser.

Phinn's networking layer is powered by hyper-rs, with hyper-tls handling the TLS context. We originally tried easier-to-use HTTP libraries such as reqwest, however combining hyper-rs and reqwest interchangeably proved difficult and took away certain low-level control that we needed.
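To make the shape of this concrete, here is a heavily simplified sketch of such a handler using hyper 0.14-era APIs (illustrative only: the handler name follows the post, but the hard-coded upstream, the error handling and the overall body are assumptions, not Phinn's actual code):

use std::convert::Infallible;
use std::net::SocketAddr;

use hyper::header::HOST;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Client, Request, Response, Server};
use hyper_tls::HttpsConnector;

type HttpsClient = Client<HttpsConnector<hyper::client::HttpConnector>>;

// Steps 1-4 above, heavily simplified: receive the request, point it at the real
// website, forward it upstream over TLS and hand the response back. A real proxy
// would also rewrite headers, cookies and the body along the way.
async fn on_request_svc(
    client: HttpsClient,
    mut req: Request<Body>,
) -> Result<Response<Body>, Infallible> {
    // Rewrite the request so it targets the proxied website (hard-coded for brevity).
    *req.uri_mut() = "https://example.com/".parse().unwrap();
    req.headers_mut().remove(HOST); // let the client set the correct upstream Host

    match client.request(req).await {
        // Transform the response (headers, cookies, body) here before returning it.
        Ok(resp) => Ok(resp),
        Err(_) => Ok(Response::builder().status(502).body(Body::empty()).unwrap()),
    }
}

#[tokio::main]
async fn main() {
    let client: HttpsClient = Client::builder().build::<_, Body>(HttpsConnector::new());

    let make_svc = make_service_fn(move |_conn| {
        let client = client.clone();
        async move {
            Ok::<_, Infallible>(service_fn(move |req| on_request_svc(client.clone(), req)))
        }
    });

    let addr = SocketAddr::from(([127, 0, 0, 1], 8080));
    Server::bind(&addr).serve(make_svc).await.unwrap();
}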

This pattern is particularly neat, as for the most part, we simply pass a single mutable Request and Response body byte buffer across our entire request lifetime.

This also allows us to handle more than just HTML rewriting in the future, including binary, image and other content (we have some interesting ideas around this), as well as removing the overhead of casting byte slices to strings and back to byte slices many times per request. A minor digression and life lesson on this last point.

At some point, we needed to perform some trivial pattern replacement (i.e. replace abc -> xyz), similar to what we explained above but for more specific transformations, such as rewriting certain key OAuth parameters (e.g. redirect_uri) for specific services in order to hijack their OAuth flow. In Rust's standard library, this doesn't seem to be possible directly on a byte slice (&[u8]). Instead, you are required to parse your byte slice as a UTF-8 &str (lossy or otherwise) and perform your regular string replacement there.
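In practice that route looks something like this (a minimal sketch; the redirect_uri value and helper function are placeholders, not Phinn's actual transformation):

// Minimal sketch of the standard-library route: view the bytes as UTF-8,
// replace the pattern, and hand back bytes.
fn replace_in_body(body: &[u8], from: &str, to: &str) -> Vec<u8> {
    // Lossy conversion only allocates a fresh copy if invalid UTF-8 is encountered.
    let text = String::from_utf8_lossy(body);
    text.replace(from, to).into_bytes()
}

fn main() {
    let body = br#"{"redirect_uri":"https://login.example.com/callback"}"#;
    let rewritten = replace_in_body(
        body,
        "https://login.example.com",
        "https://login.phish.example",
    );
    println!("{}", String::from_utf8(rewritten).unwrap());
}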

At first, we did not quite like that. We wanted to keep working with byte slices all throughout but did not have a way to, and our theory was that a direct &[u8] replacement could avoid even the minuscule cost of converting to &str before performing a pattern replacement as you normally would. So we gave it a go and implemented a small u8_replace function, which you can review in this playground, and then ran it through a makeshift test bench to see how it fared against the standard library's &str replace.

Note – This isn’t the most scientific benchmarking method. We recommend using profilers such as perf and tools such as flamegraph to really understand performance implications during runtime.

The makeshift test bench was quite simple: gather some sample HTML files and place execution timers around (a) lol_html, (b) the standard library's &str replace and (c) our &[u8] replace variant. We ran it in both debug and release builds to measure the difference, and the following are the respective results.
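For reference, the timing harness amounted to little more than the following pattern (a simplified sketch; the file name and patterns are placeholders, and single runs like this are only indicative):

// Simplified sketch of the makeshift bench: time each variant on the same input.
use std::time::Instant;

fn main() {
    let html = std::fs::read_to_string("sample.html").expect("sample HTML file");

    let start = Instant::now();
    let _rewritten = html.replace("https://example.com", "https://example.phish");
    println!("str::replace took {:?}", start.elapsed());

    // ...repeat for the lol_html rewriter and the u8_replace variant, then compare
    // a debug build (`cargo run`) against a release build (`cargo run --release`).
}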

In the worst case, with a debug profile, we see roughly a 10x slowdown. Not quite the smartest algorithm I have ever concocted, although I guess I did affect the performance by an order of magnitude, just in the wrong direction.

What is interesting to note is the dramatic performance gain on the exact same code, this time compiled with a release profile. It would be interesting to dig into some of the compiler optimizations going on here, and perhaps tune other compiler flags to strip it down further. Our key takeaway: you are most probably not going to beat the standard library. Lesson learnt and a good laugh had.

Given that our hypothesis was not sustained, we moved back to using good ol’ &str replace and carried on our journey. At this stage we had a fully functioning real-time phishing proxy and began writing some service configurations that we want to start providing to our platform.

Note – We make no mention of the DNS nameserver and Let’s Encrypt client in this article. That was developed first and already fully functioning at this point, however we decided to omit it from this post for an easier narrative. We would like to dedicate a separate post for that.

Let us finally take a look at a demo of Phinn in action. This time, simulating a phishing website for Microsoft Office 365. In this example, Phinn was run locally and bound to *.example.com for ease-of-use (editing local hosts file to point to 127.0.0.1). This is an example of one of the more complex services that Phinn can support — requiring a lot of features in the internal proxy as well as proxying over multiple service configurations as opposed to just a single one.

On the left-hand side is an Android emulator running Microsoft’s authenticator app as our 2FA push notification. On the right-hand side is Phinn, acting as a phishing website for Office 365 showing a full multi-factor authentication (MFA) bypass, OAuth hijack and successful login. It all worked like magic and taking a step back, this was one of the defining moments for us.

We love being able to demonstrate the impact of real-time phishing on services like this one, as it shows a fully functioning user authentication flow, including multi-factor authentication and OAuth, before seamlessly transitioning into an authenticated session (if a legitimate authentication session already exists). With the exception of the URL, it is virtually impossible for a Target to tell the difference!

It is important to note that this is not always the result. In fact, we do not even require that the user is able to functionally log in within the phishing website; that almost goes beyond the scope of our platform. So long as we know that a Target entered some text into a password field and a 2FA token, real attackers can automatically orchestrate post-exploit scripts against the real website, instantly and with ease, with tools such as necrobrowser.

This is where we currently stand. There are still some interesting improvements we want to implement, such as some basic performance-related changes around handling compression algorithms. For instance, right now we have to explicitly forward an Accept-Encoding: identity header to receive uncompressed response bodies. In the future, the proxy will support compressed bodies, including brotli, gzip and so forth.
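Concretely, that workaround amounts to something like the following on the outgoing (proxied) request (a sketch only; the surrounding handler is omitted):

// Sketch: force an uncompressed upstream response by overriding Accept-Encoding
// on the proxied request, so the body can be rewritten without having to
// decompress and recompress it first.
use hyper::header::{HeaderValue, ACCEPT_ENCODING};
use hyper::{Body, Request};

fn force_identity_encoding(req: &mut Request<Body>) {
    req.headers_mut()
        .insert(ACCEPT_ENCODING, HeaderValue::from_static("identity"));
}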

It was not all smooth sailing. We did hit some teething issues with hyper-rs in production related to connection pooling that we did not experience during testing and rollout. Despite that, this has been a great opportunity for growth and learning. It has been a real pleasure working with Rust, not only as a language and stack of choice but also as a community; the list of mentors we had the pleasure of working with (directly or otherwise) is too long to include here. Increasing the adoption of Rust in the information security community is a long-term objective of ours that we would love to contribute towards.
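For anyone running into similar pooling behaviour, hyper's client builder does expose knobs worth experimenting with. The values below are arbitrary and shown purely as an illustration, not as the fix we ultimately applied:

// Illustrative only: hyper 0.14's client builder exposes connection-pool settings
// such as the idle timeout and the number of idle connections kept per host.
use std::time::Duration;

use hyper::{Body, Client};
use hyper_tls::HttpsConnector;

fn build_client() -> Client<HttpsConnector<hyper::client::HttpConnector>> {
    Client::builder()
        .pool_idle_timeout(Duration::from_secs(30)) // drop idle upstream connections sooner
        .pool_max_idle_per_host(4)                  // cap idle connections kept per host
        .build::<_, Body>(HttpsConnector::new())
}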

We are extremely happy with the progress and with being able to share it with you all. We also want to slowly flesh out the wildcard-aware aspect of Phinn (i.e. understanding how to proxy wildcard traffic safely) to be able to support even more interesting services in the future.

Parting thoughts

This post is primarily a story of how we approached a technical solution. The focus is not necessarily on what we did but rather on how we approached it and why, in the hope that others may develop a similar mindset or discipline should they find it helpful. With that in mind, here are some key takeaways we hope you take from this piece.

It is always important to prototype a solution to your problem as soon as you can. It might sound obvious and like a lot of work but, as the late Richard Hamming would put it, the aim of a back-of-the-napkin demonstration is not to get it exactly right, but rather to quickly prove or disprove an idea.

Avoid slowly turning your prototype into the final solution unless you absolutely must. A prototype, albeit extremely useful, will subsequently narrow your solution into incremental progress, as opposed to the exponential progress that is more likely to come from a fully fledged solution. The objective of your prototype should ultimately be to gain insight.

Thoughtful design ahead of time pays dividends in the long run. You may be tempted to dive head-first into a problem with code, however you will often find that you lose focus of the bigger picture or overfit your solution early on, subsequently reducing your solution's flexibility to improve. If you do not believe me, try applying this technique to a side-project of yours and see if it drastically improves your project's longevity.

Lastly, remember to have fun with your solutions. The days I learn most are the ones where I do not focus on tight deadlines or external pressures (few and far between) but instead think like a child: being infinitely curious about a problem, experimenting with new ideas and assumptions, sharing them with people in the community, and being okay with potentially looking silly in the process (often ending with a laugh either way).

Conclusion

This is really just the beginning for Phinn as well as PhishDeck. There are still a lot of functional and non-functional improvements that we want to ship in the future. Having to develop such a tool to also scale along with a platform makes this even more exciting for us, as it keeps bringing new challenges for us to solve and share.

If you made it this far, we hope you enjoyed the content and would love for you to stick around for more. We omitted a lot of performance data from this technical piece; if you would like to see a separate article diving deep into some of the performance improvements and lessons learnt, let us know on Twitter.

If you want to take Phinn for a spinn (sic), sign up for a free 14-day trial of PhishDeck and set up your own phishing simulation Campaign in under two minutes.


Acknowledgements

A lot of what we showed in this post could not have been achieved at the pace it was without the invaluable contribution of Kuba Gretzky (@mrgretzky) on the EvilGinx2 project. Phinn is very much inspired by Kuba's work, which we have been closely (albeit quietly) following since the original Nginx implementation of EvilGinx1. We hope to be able to slowly open-source parts of Phinn in such a way that may benefit the broader community.

We would also like to thank Cloudflare for open-sourcing and explaining (LOL)HTML. It's a fantastic Rust library and helped push us towards developing Phinn.

Lastly but most importantly, a heartfelt thank you to our peers in the security space who have kindly peer-reviewed our work, shared their insights and been a key part of the release of this technical post.


About PhishDeck

PhishDeck is a phishing simulation platform designed to make it easy for you to simulate advanced phishing attacks across your organisation, helping you build better defences, respond to phishing threats faster and more effectively, all while providing you with actionable insights to help you continuously assess the effectiveness of your security awareness programme.